(57) Abstract: There are provided DNA constructs, including replicable cloning vectors and expression vectors, comprising a bacteriophage promoter operably linked to an outron sequence. The expression vectors provided by the invention are useful in the expression of recombinant polypeptides in host cells or organisms and are particularly useful in expression of recombinant polypeptides in nematode worms such as C. elegans.

GENE EXPRESSION SYSTEM

Field of the invention 

The invention relates to the expression of DNA,
genes, cDNAs, proteins, peptides and parts thereof in
the nematode worm C. elegans. In particular, the
invention relates to methods of improving the
translation of RNAs transcribed in C. elegans using a
bacteriophage polymerase by introduction of a
trans-splice recognition site recognised by an SL1
trans-splice recognition sequence into the DNA template
transcribed by the bacteriophage polymerase.

Background to the invention

Eukaryotic versus prokaryotic expression.

Bacteriophage RNA polymerases, such as T7, T3,
and SP6, and their corresponding promoters have been
used extensively to drive the expression of
heterologous genes in a variety of organisms. In
co-pending International patent application No. WO
00/01846, Plaetinck et al. describe the use of the T7
system to express DNA, genes, cDNA, proteins and
peptides or parts thereof and for the expression of
double-stranded RNA (dsRNA) in the nematode model
system C. elegans.

The bacteriophage expression systems are well
known in the art for use in prokaryotic host cells,
such as E. coli, and have the advantage that they
provide simple and strong expression systems dependent
only on one RNA polymerase and one well defined
promoter. The application of such efficient
expression systems in eukaryotic organisms is,
however, not straightforward, mainly because messenger
RNAs from eukaryotes and prokaryotes have a different
structure, which has implications for translation
efficiency and RNA stability.

Messenger RNAs of higher eukaryotes share a
functionally essential 5' CAP structure. This
structure is generated during a capping reaction that
is linked exclusively to RNA polymerase II
transcription. Prokaryotic RNA polymerases such as
bacteriophage T3, T7 and SP6 polymerases do not
provide messenger RNAs with such a CAP structure,
leading to inefficient translation in eukaryotic
systems (Fuerst et al., J. Mol. Biol. 206:333-348
(1989)).

One way to improve translation of uncapped mRNAs
in eukaryotic systems is by the insertion of an
internal ribosome entry site (IRES) sequence 5' of the
coding sequence. For example, Elroy-Stein et al.,
Proc. Natl. Acad. Sci. USA 87:6743-6747 (1990),
describe the cloning of the untranslated region of
encephalomyocarditis virus (EMCV) downstream of the T7
promoter in order to enhance the efficiency of
translation. In other systems translation of
T7-derived transcripts may be enhanced by addition of
a CAP structure derived from a capped transcript. For
example, in Trypanosoma a 5' CAP structure is added to
T7 generated RNA transcripts by a naturally occurring
trans-splicing reaction (Wirtz et al., NAR
22:3887-3894 (1994)).

Trans-splicing in C. elegans.

In C. elegans many mRNAs contain an identical
short leader sequence, designated the spliced leader
(SL). This splice leader is donated by a small RNA (SL
RNA) via a trans-splicing reaction. This
trans-splicing was first observed by Krause et al., Cell
49:753-61 (1987). The splice leader RNA exists as a
small nuclear ribonucleoprotein particle and has the
trimethylguanosine cap that is characteristic of
eukaryotic small nuclear RNAs. The trimethylguanosine
cap present on the spliced leader RNA is transferred
to the pre-mRNA during the trans-splicing reaction.
Thereafter, the trimethylguanosine cap is maintained
on the mature mRNA (Van Doren et al., Mol. Cell. Biol.
10:1769-1772 (1990)). The trans-splicing signal for
such a splice leader is essentially an intron missing
only the 5' splice site, designated an 'outron'. An
outron has essentially all the intron sequence
including a trans-splice acceptor site homologous to a
UUUCAG sequence preceded by an AU-rich region (Conrad
et al., NAR 21:913-919 (1993)). Introduction of an
outron into the 5' untranslated region of a C. elegans
gene converts it to a trans-spliced gene (Conrad et
al., EMBO J. 12:1249-1255 (1993); Conrad et al., Mol.
Cell. Biol. 11:1921-1926 (1991)) and introduction of
donor sites in a natural trans-spliced C. elegans gene
prevents trans-splicing and converts it into a more
conventional gene.


Description of the invention. 

Until recently, expression of heterologous and
homologous genes in C. elegans was mainly achieved by
linking an appropriate coding sequence to a selected
C. elegans promoter. The present inventors have
recently demonstrated that recombinant gene
expression in C. elegans can be based on the
prokaryotic T7 expression system (WO 00/01846).
However, the present inventors found that the
expression system was far from being efficient, or at
least the resulting expression was much lower than
would be expected from this T7 related expression
system. It was concluded that this low expression was
mainly due to RNA instability or translation arrest.
Furthermore, it was reasoned that fundamental
differences between prokaryotic and eukaryotic
expression systems, particularly the requirement for
capping of the 5' end of the mRNA for efficient
translation in eukaryotic systems, were the main reason
for this unexpectedly low expression.

The inventors have now developed a solution to
the problem of the inefficiency of the T7 system in
eukaryotic host cells and organisms, particularly in
C. elegans, and have constructed a generally
applicable expression system which allows for the
efficient expression of genes, DNA, cDNA, peptides and
proteins under the regulation of the T7 promoter in C.
elegans.

Therefore, in accordance with a first aspect of
the invention there is provided a DNA construct
comprising a bacteriophage promoter operably linked to
an outron sequence.

It is an essential feature of the DNA construct
of the invention that the bacteriophage promoter and
the outron sequence are "operably linked", that is to
say they are arranged in a relationship permitting
them to function in their intended manner. In this
case, the bacteriophage promoter is positioned
upstream of the outron sequence such that it is
capable of promoting transcription of the outron
sequence upon binding of an appropriate RNA
polymerase, with the outron sequence forming the
extreme 5' end of the resulting transcript.

The DNA construct may further comprise at least
one restriction enzyme recognition site positioned
downstream of and proximal to the outron sequence.
Advantageously, the DNA construct may contain multiple
restriction sites forming a multi-cloning site. The
purpose of the restriction site/multi-cloning site is
to facilitate cloning of a heterologous or homologous
DNA fragment downstream of the outron sequence. A DNA
construct comprising a bacteriophage promoter, an
outron sequence and a restriction site/multi-cloning
site may therefore be referred to hereinafter as an
'outron cloning construct'.

In an outron cloning construct it is advantageous
for the restriction site/multi-cloning site to be
positioned fairly proximal to the outron sequence
(e.g. within 100 bp) such that a heterologous or
homologous sequence inserted at this site may be
co-transcribed with the outron sequence on a single mRNA.
However, further sequence elements may be interposed
between the outron sequence and the restriction
site/multi-cloning site. For example, the general
purpose vector pDW3123 described in the accompanying
examples has a synthetic intron A sequence between the
outron sequence and the multi-cloning site.

In one preferred embodiment of the invention, the
DNA construct is a replicable cloning vector, such as,
for example, a plasmid vector. In addition to the
bacteriophage promoter, outron sequence and optional
restriction site/multi-cloning site, the vector may
further contain one or more of the general features
commonly found in cloning vectors, for example an
origin of replication to allow autonomous replication
within a host cell and a selective marker, such as an
antibiotic resistance gene. Although not essential,
the vector may also contain a poly-adenylation signal
to stabilize and process the 3' end of the mRNA
transcribed from the bacteriophage promoter. A
preferred example is the 3' UTR from the C. elegans
unc-54 gene, but any other 3' UTR or polyadenylation
signal may be used.

Outron-containing DNA constructs according to the
invention may easily be constructed from the
component sequence elements using standard recombinant
techniques well known in the art and described, for
example, in F. M. Ausubel et al. (eds.), Current
Protocols in Molecular Biology, John Wiley & Sons,
Inc. (1994).

Outron sequences for use in the constructs of the
invention may be isolated from natural C. elegans
genes using standard molecular biology techniques.
For example, a natural outron sequence might be
amplified using the polymerase chain reaction or an
equivalent amplification technique using C.
elegans genomic DNA as a template. Alternatively,
synthetic outron sequences may be synthesised, for
example, by annealing two complementary single
stranded oligonucleotides, as illustrated in the
accompanying examples. Once a DNA fragment comprising
the outron sequence has been obtained it would be a
matter of routine to assemble an outron construct by
linking the outron in the correct orientation relative
to the bacteriophage promoter.

The sequences of the commonly used bacteriophage
promoters, e.g. T7, T3 and SP6, are well known in the
art and oligonucleotides containing functional phage
promoter sequences can be readily synthesised using
standard oligonucleotide synthesis techniques. It
would be a matter of routine to insert such a
synthetic promoter sequence into, for example, a
plasmid vector backbone containing, for example, an
origin of replication, a selective marker and a
suitable restriction site. Alternatively, one of the
many plasmid vectors containing bacteriophage promoter
sequences known in the art may be used as the starting
point for the construction of a plasmid-based outron
cloning vector. The known vectors generally contain,
in addition to the phage promoter sequence, one or
more restriction sites conveniently positioned
downstream of the phage promoter and also a bacterial
origin of replication and a selective marker. Once
the vector backbone is in place the outron sequence
may simply be inserted in the appropriate position
downstream of the bacteriophage promoter.

In a particularly useful embodiment the invention
provides a DNA construct for use in bacteriophage
promoter-driven expression of a polypeptide in a
eukaryotic host cell or organism. This construct
comprises a bacteriophage promoter operably linked to
a DNA sequence such that it is capable of initiating
transcription of the DNA sequence upon binding of an
appropriate RNA polymerase to the promoter, wherein
the aforesaid DNA sequence comprises an outron
sequence and at least one open reading frame
positioned downstream of the outron sequence.

The open reading frame may be essentially any
protein-encoding DNA sequence bounded by start and
stop codons. This protein-encoding DNA sequence may
include introns, as both trans-splicing and
cis-splicing can occur together.

A DNA construct according to this embodiment of
the invention, which may be referred to hereinafter as
an 'outron expression construct', may be derived from
an outron cloning construct by insertion of a
heterologous or homologous protein-encoding DNA
fragment into the restriction site/multi-cloning site.
It is essential that the heterologous or homologous
DNA fragment be inserted downstream of the outron
sequence such that the two sequences may be
co-transcribed, with the outron sequence forming part of
the 5' untranslated region of the resulting mRNA.

The outron expression construct may
advantageously form an expression vector, such as, for
example, a plasmid vector. Most preferably, the
expression vector will be one suitable for use in the
nematode worm C. elegans. In addition to the
bacteriophage promoter, outron sequence and
protein-encoding DNA sequence (open reading frame), the
expression vector may further contain one or more of
the general features commonly found in expression
vectors, for example an origin of replication to allow
autonomous replication within a bacterial host cell
and a selective marker, such as an antibiotic
resistance gene. The vector may also contain a
poly-adenylation signal to stabilize and process the
3' end of the mRNA transcribed from the bacteriophage
promoter. A preferred example is the 3' UTR from the
C. elegans unc-54 gene, but any other 3' UTR or
polyadenylation signal may be used. An additional
element, such as for example a synthetic intron, may
be interposed between the outron sequence and the open
reading frame.

It is important that the open reading frame is
positioned downstream of and proximal to the outron
sequence in the expression construct such that (i) the
two elements are co-transcribed to form a single mRNA
and (ii) the outron sequence forms part of the 5'
untranslated region of the mRNA. If the appropriate
splicing machinery and a supply of SL RNAs are provided
by the eukaryotic host cell or organism then the
uncapped 5' end of the pre-mRNA transcribed from the
expression construct will be replaced with a capped
splice leader via the trans-splicing reaction. This
will greatly increase the efficiency of translation in
a eukaryotic host system.
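
The positional requirements described above can be illustrated with a small sketch. The Python below is an illustration only: the `Element` record, the element names and all coordinates are hypothetical and are not taken from pDW3123 or pDW3124. It merely checks that a promoter lies upstream of an outron, which in turn lies upstream of an open reading frame, so that the three would be transcribed in that order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A named feature on a construct, using half-open coordinates [start, end)."""
    name: str   # hypothetical labels: "promoter", "outron", "orf"
    start: int
    end: int


def has_expression_layout(elements: list[Element]) -> bool:
    """Return True if promoter, outron and ORF occur in that 5'-to-3' order."""
    by_name = {element.name: element for element in elements}
    try:
        promoter = by_name["promoter"]
        outron = by_name["outron"]
        orf = by_name["orf"]
    except KeyError:
        # A cloning construct without an ORF is not an expression construct.
        return False
    return promoter.end <= outron.start and outron.end <= orf.start


# Hypothetical coordinates, chosen only to exercise the check.
example = [
    Element("promoter", 0, 19),
    Element("outron", 19, 80),
    Element("orf", 150, 900),
]
print(has_expression_layout(example))
```

The check deliberately ignores the "proximal" requirement and any interposed element such as a synthetic intron; a fuller model would add a maximum distance between the outron and the open reading frame.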

The use of an outron sequence at the extreme 5'
end of the RNA provides a solution to the problem of
reduced expression efficiency in eukaryotic systems
wherever the type of promoter/polymerase used to drive
gene expression leads to the production of uncapped
transcripts, provided that the host cell or organism
produces the spliced leader RNAs required for the
trans-splicing reaction.

Outron sequences which may be utilised in
accordance with the invention include naturally
occurring outron sequences isolated from SL1-specific
C. elegans genes (Conrad, R. Functional analysis of a
C. elegans trans-splice acceptor. Nucleic Acids Res.
1993, 21(4), pp913-919; Conrad, R. SL1 trans-splicing
specified by AU-rich synthetic RNA inserted at the 5'
end of Caenorhabditis elegans pre-mRNA. RNA. 1995,
1(2), pp164-170) and also synthetic outron sequences
which are functionally equivalent to the natural C.
elegans outron sequences, including variants of
naturally occurring C. elegans outrons. The phrase
"functionally equivalent" means that the synthetic
outron is recognised by the C. elegans trans-splicing
machinery and can be trans-spliced to a C. elegans
splice leader RNA, preferably the SL1 splice leader.

Experimental evidence indicates that
trans-splicing in C. elegans is signalled by an AU-rich
intron-like sequence followed by a splice acceptor
site (Conrad et al. 1993 and 1995). For the purposes
of the present application the terms "outron" or
"outron sequence" should be interpreted as referring
to both the AU-rich region from the 5' end of the
pre-mRNA to the trans-splice acceptor site and the
trans-splice acceptor site itself. In connection with the
DNA constructs of the invention, the terms "outron"
and "outron sequence" refer to features present in the
DNA which encodes the pre-mRNA.

The consensus splice acceptor site for
trans-splicing of outrons and the consensus 3' splice
acceptor site for cis-splicing of introns are
essentially identical (UUUCAG). Moreover, a normally
trans-spliced acceptor site can be efficiently
cis-spliced when a donor splice site is inserted upstream
within the outron sequence. It is therefore important
that the outron constructs described herein do not
contain any potential splice donor sequence upstream
of the splice acceptor within the outron and
downstream of the transcription start site such that
it will be transcribed in the mRNA encoded by the
construct. If such a site were present then there
would be a potential for cis-splicing rather than
trans-splicing.

It has also been observed that the overall length
of the outron has an effect on the efficiency of
trans-splicing, longer outrons in general working
better than shorter ones (Conrad et al. 1995).
Advantageously, the outron sequences for inclusion
into the outron constructs described herein should be
greater than about 50 nt in length.
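
The features of a usable outron set out above (an AT-rich region, a TTTCAG acceptor, no upstream splice donor, and a length above about 50 nt) lend themselves to a simple screening sketch. The Python below is a hedged illustration, not a validated splice-site predictor: the 0.6 AT-fraction cut-off and the `GT[AG]AGT` donor pattern are assumptions introduced here, and only the TTTCAG acceptor and the 50 nt guideline come from the text.

```python
import re

MIN_LENGTH_NT = 50       # "greater than about 50 nt" from the description
MIN_AT_FRACTION = 0.6    # assumed cut-off for "AT-rich"; not from the source
ACCEPTOR = "TTTCAG"      # DNA form of the UUUCAG consensus acceptor
# Assumed crude stand-in for a splice donor consensus; a real screen would
# need a proper splice-site model.
DONOR_PATTERN = re.compile(r"GT[AG]AGT")


def screen_outron(dna: str) -> dict:
    """Report which of the described outron features a candidate DNA has."""
    seq = dna.upper().replace("U", "T")
    acceptor_pos = seq.rfind(ACCEPTOR)
    # Everything 5' of the last acceptor is treated as the AT-rich region.
    upstream = seq[:acceptor_pos] if acceptor_pos != -1 else seq
    at_count = upstream.count("A") + upstream.count("T")
    at_fraction = at_count / len(upstream) if upstream else 0.0
    return {
        "length_ok": len(seq) > MIN_LENGTH_NT,
        "has_acceptor": acceptor_pos != -1,
        "at_rich": at_fraction >= MIN_AT_FRACTION,
        "donor_free": DONOR_PATTERN.search(upstream) is None,
    }


# Hypothetical candidate: an AT repeat followed by TTTTCAG (67 nt in total).
candidate = "AT" * 30 + "TTTTCAG"
print(screen_outron(candidate))
```

A candidate passing all four checks would still need experimental confirmation, since trans-splicing efficiency depends on more than these sequence features.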

A synthetic outron containing an AT stretch and a
TTTTCAG sequence has been shown to be functional in C.
elegans. As illustrated in the accompanying Examples,
the insertion of an outron sequence into the 5'
untranslated region of a GFP reporter construct,
downstream of the promoter and upstream of the GFP
open reading frame, is required for optimal expression
of bacteriophage RNA polymerase transcribed reporter
gene mRNA in C. elegans.

Suitable bacteriophage promoters which may be
used in the DNA constructs according to the invention
include T7, T3 and SP6 promoters, with T7 being the
most preferred. As discussed above, these
bacteriophage promoters have long been known to be
useful tools in molecular biology since they can
provide simple and strong expression systems dependent
only on the binding of the specific or cognate RNA
polymerase.

In a still further aspect, the invention provides
a method for expressing a recombinant polypeptide in
C. elegans, which method comprises:

introducing an outron expression construct, as
described above, said construct being an expression
vector suitable for use in C. elegans, into a C.
elegans strain which expresses an RNA polymerase
specific for the bacteriophage promoter present in
said DNA construct in one or more tissues or cell
types.

An outron expression vector for use in this
method may be constructed by inserting DNA encoding
the polypeptide of interest into an outron cloning
vector, as described above. The vector must be one
which is suitable for use in C. elegans; plasmid-based
vectors are the most preferred.

The C. elegans worms are preferably transgenic
worms carrying a transgene capable of expressing the
RNA polymerase in one or more tissues or cell types.
The term "transgene capable of expressing" as used
herein means a nucleic acid molecule comprising a
nucleotide sequence encoding the polymerase operably
linked to a promoter. The promoter may be any
promoter which functions in C. elegans and may be
general (i.e. active in substantially all tissues and
cell types), tissue-specific, cell type-specific,
constitutive, inducible etc. Most preferably, the
promoter will exhibit tissue or cell type-specificity.
With the use of a tissue or cell type-specific
promoter of the appropriate specificity it is possible
to control the site of RNA polymerase expression
within C. elegans and hence control the site of
expression of the recombinant polypeptide.

Methods for the construction of transgenic C.
elegans worms are known in the art and are
particularly described by Craig Mello and Andrew Fire,
Methods in Cell Biology, Vol 48, Ed. H.F. Epstein and
D.C. Shakes, Academic Press, pages 452-480.

In a further aspect the invention provides a kit
for use in recombinant expression of a polypeptide in
C. elegans, the kit comprising an outron cloning
construct, as described above, and optionally a supply
of C. elegans nematode worms expressing an RNA
polymerase specific for the bacteriophage promoter
present in the said outron cloning construct in one or
more tissues or cell types.

The kit might further contain control inserts and
control constructs, e.g. reporter gene inserts and
constructs which could be used to check efficiency of
cloning steps and transfection steps, respectively.
It might also contain constructs which may be used as
selectable markers in the transfection procedure, e.g.
a rol-6 plasmid (see below).

The invention further provides methods for the
construction of transgenic C. elegans expressing a
recombinant polypeptide in one or more tissues or cell
types. One such method comprises introducing an
outron expression construct, as described above, said
construct being an expression vector suitable for use
in C. elegans comprising an open reading frame
encoding the desired recombinant polypeptide, into a
C. elegans strain which expresses an RNA polymerase
specific for the bacteriophage promoter present in
said DNA construct in one or more tissues or cell
types, and isolating transgenic C. elegans lines which
stably express the said polypeptide. The C. elegans
strain expressing the polymerase is preferably a
transgenic strain carrying a transgene capable of
expressing the RNA polymerase in one or more tissues
or cell types, as described above. As aforesaid,
transgenic C. elegans lines can readily be constructed
using standard techniques well known in the art.

In an alternative approach, the method may
comprise introducing into a background C. elegans
strain (i) an outron expression construct, as
described above, said construct being an expression
vector suitable for use in C. elegans comprising an
open reading frame encoding the desired recombinant
polypeptide, and (ii) a DNA construct suitable for
expression of an RNA polymerase specific for the
bacteriophage promoter present in the outron
expression construct in one or more tissues or cell
types of C. elegans, and isolating transgenic C.
elegans lines which stably express the said
polypeptide. The second DNA construct may,
advantageously, be an expression vector comprising a
nucleotide sequence encoding the polymerase operably
linked to a promoter having the appropriate tissue or
cell type specificity.

In carrying out the methods of the invention one
may employ standard techniques well known in the art
for construction and selection of transgenic C.
elegans lines. Such techniques are described, for
example, in Methods in Cell Biology, vol 48;
Caenorhabditis elegans: modern biological analysis of
an organism, ed. Epstein and Shakes, Academic Press,
1995. Foreign DNA (e.g. plasmid DNA) may be introduced
into C. elegans using microinjection or ballistic
transformation, as described in the applicant's
co-pending International patent application No. WO
99/49066. In order to facilitate the selection of
transgenic strains a marker plasmid may be
co-introduced with the transgenes. A typical example is
the plasmid pRF4 (Mello, C. C. et al. EMBO J. 10,
3959-3970 (1991)) which carries the rol-6 gene. C.
elegans expressing rol-6 can be identified by
screening for the roller phenotype. Any other C.
elegans dominant selectable phenotypic marker, of
which there are many known in the art, may be used to
facilitate selection of transgenic lines. A useful
example is green fluorescent protein (or any of the
equivalent autonomous fluorescent proteins known in
the art).

In a still further aspect the invention provides
transgenic C. elegans worms which contain an outron
expression construct, as described above, said
construct being an expression vector suitable for use
in C. elegans, and which further express an RNA
polymerase specific for the bacteriophage promoter
present in the outron expression construct in one or
more tissues or cell types.

The present invention will be further understood 
with reference to the following non-limiting Examples, 
together with the accompanying drawings in which: 


Figure 1 illustrates the construction of a
T7-outron-GFP vector. (A) sequence of the synthetic outron
produced by annealing oligonucleotides o-GN59 and
o-GN60. (B) summary of the strategy used to construct
vector pDW3124.

Figure 2 shows plasmid maps for pDW3123 (outron
cloning vector) and pDW3124 (outron expression vector
for GFP expression).

Figure 3 is a plasmid map of pGN148 which contains a 
T7 RNA polymerase coding sequence under the regulation 
of the C. elegans SERCA promoter. 

Figure 4 illustrates the nucleotide sequence of
pGN148.

Figure 5 illustrates the nucleotide sequence of
pDW3123 annotated to show the positions of the T7
promoter, outron, synthetic intron A, multi-cloning
site and unc-54 3' UTR sequences and also the
ampicillin resistance gene.

Figure 6 illustrates the nucleotide sequence of
pDW3124 annotated to show the positions of the T7
promoter, outron, synthetic intron A, GFP with introns
and unc-54 3' UTR sequences and also the ampicillin
resistance gene.

WO 01/88114    PCT/EP01/05794

Example 1 - Construction of a T7-outron-GFP containing
vector (pDW3124)

An SL1 trans-splice acceptor site (outron) was
cloned into a vector downstream of the T7 promoter and
upstream of the GFP to be expressed.

A synthetic outron consisting of two partially
overlapping oligonucleotides (o-GN59 and o-GN60, see
Figure 1) was inserted into an XbaI/XmaI digested T7
promoter GFP construct. Briefly, 25 µl o-GN59 and 25 µl
o-GN60 (100 µM) were denatured for 5 minutes at 94°C,
annealed for 30 minutes at 68°C then cooled to 4°C.
1 µl of XmaI/XbaI digested pDW3120 and 10 µl of the
annealed oligos were then ligated using T4 ligase
overnight at 16°C, transformed into competent E. coli
and analysed by restriction digestion and DNA
sequencing, all according to standard molecular
biology procedures. The resulting vector was
designated pDW3124 (Figures 1 and 2).

The outron contains an AU rich sequence followed
by a splice-acceptor site as described by Conrad et
al., NAR 21:913-919 (1993) (see Figure 1).

Example 2 - Construction of a T7-Outron MCS vector

A general purpose vector was constructed to
facilitate expression of other DNA sequences in C.
elegans under the control of the T7 promoter. This
was done by digesting vector pDW3124 with HindII
(position 179) and PvuII (position 1029) (partial
digest) and re-ligating the blunt ends, resulting in
vector pDW3123 (Figure 2).




Example 3 - The expression of heterologous genes in C.
elegans regulated by the T7 promoter requires
trans-splicing.

Wild-type C. elegans nematodes were co-injected
with various combinations of the following test
plasmids:

1) GFP reporter plasmid
   GFP: pDW2020
   outron-GFP: pDW2024
   T7 promoter-GFP: pDW3120
   T7 promoter-outron-GFP: pDW3124

2) T7 polymerase expression plasmid
   SERCA T7 polymerase: pGN148

together with pRF4 (rol-6) as marker.

For every co-injection experiment, a total
concentration of 200 ng DNA/µl was used (plasmid
concentration was 50 ng/µl and carrier DNA was added
up to 200 ng/µl). For every co-injection approximately
15 adult worms were injected.
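
As a worked illustration of the injection-mix arithmetic, the sketch below computes how much carrier DNA is needed to reach the stated 200 ng/µl total. It is an assumption made here, not a statement from the text, that each plasmid in a mix (reporter, pGN148 and the rol-6 marker) was present at 50 ng/µl; the function name is likewise hypothetical.

```python
def carrier_concentration(plasmids_ng_per_ul: list[float],
                          total_ng_per_ul: float = 200.0) -> float:
    """Return the carrier DNA concentration needed to reach the total."""
    used = sum(plasmids_ng_per_ul)
    if used > total_ng_per_ul:
        raise ValueError("plasmids alone exceed the target total concentration")
    return total_ng_per_ul - used


# Assumed mix: reporter + polymerase plasmid + marker, each at 50 ng/µl.
print(carrier_concentration([50.0, 50.0, 50.0]))  # 50.0
# Assumed mix without the polymerase plasmid: reporter + marker only.
print(carrier_concentration([50.0, 50.0]))        # 100.0
```

Under these assumptions, mixes lacking the polymerase plasmid would simply carry more carrier DNA, keeping the total DNA concentration constant across experiments.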

F1 offspring showing the marker rol-6 phenotype
were isolated and then selected for further study.
The next generation (F2) of the roller lines was
screened for GFP expression in the pharynx, vulva,
tail and body wall muscles. These are the tissues in
which the bacteriophage T7 RNA polymerase is known to
be expressed when under the control of the C. elegans
SERCA promoter (as in the construct pGN148).

The results are shown in Table 1 below, which 
indicates the number of lines expressing GFP vs total 
number of lines isolated. 
Table 1

    1                                           2                  3
A   Construct                                   no T7-polymerase   with T7-polymerase
                                                construct          construct (50 ng)
                                                                   pGN148
B   GFP (50 ng) pDW2020                         0/8                2/6*
C   outron::GFP (50 ng) pDW2024                 0/11               3/8*
D   T7-promoter::GFP (50 ng) pDW3120            0/3                0/5
E   T7-promoter::outron::GFP (50 ng) pDW3124    0/7                13/13

* GFP expression most probably the result of recombination
in the extrachromosomal array

No GFP expression was observed in the experiments
where the T7 RNA polymerase was absent (cells B2, C2,
D2, E2).

In the experiments where the T7 RNA polymerase
expressing vector was co-injected with GFP vectors
without a T7 promoter, as in the cells B3 and C3, GFP
expression was sometimes observed. This is probably
due to recombination events in the extrachromosomal
arrays, resulting in transcription of GFP directly
from the SERCA promoter.

In the experiments where the T7 promoter-GFP construct
and the SERCA T7 RNA polymerase were co-injected, no
GFP expression could be observed (cell D3). In
contrast, all of the lines isolated from the
experiments where the GFP transcript contained an
outron at its 5' end (n=13) expressed GFP (cell E3).
The outron is a favourable target for SL1
trans-splicing. Since SL1 RNA molecules contain a 5'
trimethylguanosine CAP structure which is transferred
to the mature mRNA this results in improved
translation of the RNA and hence better expression of
GFP. Without the outron the T7 RNA polymerase
transcripts do not carry a CAP structure at their 5'
end, leading to inefficient translation. The results
of this experiment illustrate the importance of
trans-splicing for efficient expression of
heterologous and homologous genes transcribed by
prokaryotic polymerases in C. elegans.
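
The comparison drawn from Table 1 can be tallied with a few lines of Python. This is only a convenience sketch: the counts are transcribed from Table 1, while the dictionary layout, the key names and the helper function are hypothetical choices made here.

```python
from fractions import Fraction

# (lines expressing GFP, lines isolated), transcribed from Table 1.
TABLE_1 = {
    "GFP (pDW2020)": {"no_T7_pol": (0, 8), "with_T7_pol": (2, 6)},
    "outron::GFP (pDW2024)": {"no_T7_pol": (0, 11), "with_T7_pol": (3, 8)},
    "T7-promoter::GFP (pDW3120)": {"no_T7_pol": (0, 3), "with_T7_pol": (0, 5)},
    "T7-promoter::outron::GFP (pDW3124)": {"no_T7_pol": (0, 7), "with_T7_pol": (13, 13)},
}


def expressing_fraction(construct: str, condition: str) -> Fraction:
    """Fraction of isolated lines that expressed GFP for one table cell."""
    expressing, isolated = TABLE_1[construct][condition]
    return Fraction(expressing, isolated)


for construct in TABLE_1:
    with_pol = expressing_fraction(construct, "with_T7_pol")
    without_pol = expressing_fraction(construct, "no_T7_pol")
    print(f"{construct}: {without_pol} without, {with_pol} with T7 polymerase")
```

The tally makes the contrast between cells D3 (0 of 5 lines) and E3 (13 of 13 lines) explicit; with line counts this small, the fractions are descriptive only.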



SEQUENCE LISTING 



SEQ ID NO: 1 Oligonucleotide o-GN59

SEQ ID NO: 2 Oligonucleotide o-GN60

SEQ ID NO: 3 Plasmid pDW3123

SEQ ID NO: 4 Plasmid pDW3124

SEQ ID NO: 5 Plasmid pGN148

Claims:

1. A DNA construct comprising a bacteriophage
promoter operably linked to an outron sequence.

2. A DNA construct as claimed in claim 1 which 
further comprises at least one restriction enzyme 
recognition site positioned downstream of and proximal 
to the outron sequence. 

3. A DNA construct as claimed in claim 2 which 
comprises a multi-cloning site positioned downstream 
of and proximal to the outron sequence. 

4. A DNA construct as claimed in claim 2 or
claim 3 which further comprises a DNA fragment
inserted at the said restriction site or at a
restriction site within the said multi-cloning site.

5. A DNA construct as claimed in any one of
claims 1 to 4 which is a replicable cloning vector.

6. A DNA construct as claimed in any one of
claims 1 to 5 wherein the outron sequence comprises a
3' splice acceptor site having the sequence TTTCAG
preceded by an AT-rich region.

7. A DNA construct as claimed in claim 6
wherein the outron sequence comprises the nucleotide
sequence illustrated in Figure 1A.

8. A DNA construct as claimed in any one of 
claims 1 to 7 wherein the bacteriophage promoter is 
the T7, T3 or SP6 promoter. 



9. A DNA construct for use in bacteriophage
promoter-driven expression of a polypeptide in a
eukaryotic host cell or organism, which construct
comprises a bacteriophage promoter operably linked to
a DNA sequence such that it is capable of initiating
transcription of said DNA sequence upon binding of the
appropriate RNA polymerase to the promoter, wherein
the said DNA sequence comprises an outron sequence and
at least one open reading frame positioned downstream
of the outron sequence.

10. A DNA construct as claimed in claim 9 which
is an expression vector.

11. A DNA construct as claimed in claim 9 or
claim 10 wherein the outron sequence comprises a 3'
splice acceptor site having the sequence TTTCAG
preceded by an AT-rich region.


12. A DNA construct as claimed in claim 11
wherein the outron sequence comprises the nucleotide
sequence illustrated in Figure 1A.

13. A DNA construct as claimed in any one of 
claims 9 to 12 wherein the bacteriophage promoter is 
the T7, T3 or SP6 promoter. 


14. A kit for use in recombinant expression of a 
polypeptide in C. elegans, the kit comprising a DNA 
construct as claimed in any one of claims 1 to 3, and 
optionally C. elegans worms expressing an RNA
polymerase specific for the bacteriophage promoter
present in said DNA construct in one or more tissues
or cell types. 

15. A method for expressing a recombinant
polypeptide in C. elegans which method comprises:

introducing a DNA construct as claimed in any one
of claims 9 to 13, said construct being an expression
vector suitable for use in C. elegans, into a C.
elegans strain which expresses an RNA polymerase
specific for the bacteriophage promoter present in
said DNA construct in one or more tissues or cell
types.

16. A method of generating transgenic C. elegans
expressing a recombinant polypeptide, which method
comprises:

introducing a DNA construct as claimed in any one
of claims 9 to 13 comprising an open reading frame
encoding the recombinant polypeptide, said construct
being an expression vector suitable for use in C.
elegans, into a C. elegans strain which expresses an
RNA polymerase specific for the bacteriophage promoter
present in said DNA construct in one or more tissues
or cell types, and

isolating transgenic C. elegans lines which
stably express the said polypeptide.

17. A method of generating transgenic C. elegans
expressing a recombinant polypeptide, which method
comprises:

introducing into C. elegans (i) a first DNA
construct as claimed in any one of claims 9 to 13
comprising an open reading frame encoding the
recombinant polypeptide, said construct being an
expression vector suitable for use in C. elegans, and
(ii) a second DNA construct suitable for expression of
an RNA polymerase specific for the bacteriophage
promoter present in the first DNA construct in one or
more tissues or cell types of C. elegans, and

isolating transgenic C. elegans lines which
stably express the said polypeptide.

18. Transgenic C. elegans which contain a DNA
construct as claimed in any one of claims 9 to 13,
said construct being an expression vector suitable for
use in C. elegans, and which further express an RNA
polymerase specific for the bacteriophage promoter
present in said DNA construct in one or more tissues
or cell types.

XbaI overhang               SspI       3' splice acceptor

CTAGATTACAACTAATTATACTTATTTGAATATTCAAATTTTCAGAC        o-GN59

    TAATGTTGATTAATATGAATAAACTTATAAGTTTAAAAGTCTGGGCC    o-GN60

                                           XmaI overhang

[Plasmid map labels: synth. intron A, GFP with introns, amp, unc-54 3' UTR]

XmaI/XbaI digested pDW3120
25 µl o-GN59 + 25 µl o-GN60 (100 µM)
Denature oligos o-GN59 & o-GN60 5 min. at 94°C
Renature 30 min. at 68°C, cool to 4°C

Ligate 1 µl vector + 10 µl oligos with T4 ligase
Overnight at 16°C
Transform in E. coli

Analyse by restriction digest and sequencing

[Plasmid map labels: GFP with introns, SacI, unc-54 3' UTR, amp]

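
The annealing step in the scheme above only works if o-GN59 and o-GN60 are complementary over their central region and leave single-stranded 5' ends that match XbaI- and XmaI-cut vector. The following Python sketch checks that property. The two sequences are an assumption: they are read from the figure, cross-checked against the annotated T7/outron sequence shown later in the document, and have not been verified against the formal sequence listing. The function names are illustrative only.

```python
# Sketch: do two oligos anneal into a duplex with 4-nt 5' overhangs at both ends?
# Sequences are reconstructed from the figure (assumption, not a verified listing).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# o-GN59 as drawn in the figure (5'->3').
O_GN59 = "CTAGATTACAACTAATTATACTTATTTGAATATTCAAATTTTCAGAC"

# o-GN60 is drawn 3'->5' under o-GN59; reverse it to obtain the 5'->3' form.
O_GN60_AS_DRAWN = "TAATGTTGATTAATATGAATAAACTTATAAGTTTAAAAGTCTGGGCC"
O_GN60 = O_GN60_AS_DRAWN[::-1]


def annealed_overhangs(top: str, bottom: str, overhang: int = 4):
    """Return the two 5' overhangs if the oligos pair everywhere else, else None.

    Both oligos are given 5'->3'. The first `overhang` bases of each are
    expected to stay single-stranded after annealing.
    """
    top_core = top[overhang:]
    bottom_core = bottom[overhang:]
    if reverse_complement(top_core) != bottom_core:
        return None
    return top[:overhang], bottom[:overhang]


if __name__ == "__main__":
    # 5'-CTAG matches an XbaI-cut end, 5'-CCGG matches an XmaI-cut end.
    print(annealed_overhangs(O_GN59, O_GN60))
```

In this reading, the duplex carries a 5'-CTAG end compatible with XbaI-digested vector and a 5'-CCGG end compatible with XmaI-digested vector, and the top strand contains both the SspI site (AATATT) and the TTTCAG splice acceptor.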
Nucleotide sequence of pGN148

atgactgctccaaagaagaagcgtaaggtaccggtaatgaacacgattaacatcgctaagaacgacttctc 
tgacatcgaactggctgctatcccgttcaacactctggctgaccattacggtgagcgtttagctcggtaag 
tttaaacatctagatactaactaacgattaacatttaaattttcagcgaacagttggcccttgagcatgag 
tcttacgagatgggtgaagcacgcttccgcaagatgtttgagcgtcaacttaaagctggtgaggttgcgga 
taacgctgccgccaagcctctcatcactaccctactccctaagatgattgcacgcatcaacgactggtttg 
aggaagtgaaagctaagcgcggcaagcgcccgacagccttccagttcctgcaagaaatcaagccggaagcc 
gtagcgtacatcaccattaagaccactctggcttgcctaaccagtgctgacaatacaaccgttcaggctgt 
agcaagcgcaatcggtcgggccattgaggacgaggctcgcttcggtcgtatccgtgaccttgaagccaagc 
acttcaagaaaaacgttgaggaacaactcaacaagcgcgtagggcacgtctacaagaaagcatttatgcaa 
gttgtcgaggctgacatgctctctaagggtctactcggtggcgaggcgtggtcttcgtggcataaggaaga 
ctctattcatgtaggagtacgctgcatcgagatgctcattgagtcaaccggagtggttagcttacaccgcc 
aaaatgctggcgtagtaggtcaagactctgagactatcgaactcgcacctgaatacgctgaggctatcgca 
acccgtgcaggtgcgctggctggcatctctccgatgttccaacctcgcgtagttcctcctaagccgtggac 
tggcattactggtggtggctattgggctaacggtcgtcgtcctctggcgctggtgcgtactcacagtaaga 
aagcactgatgcgctacgaagacgtttacatgcctgaggtgtacaaagcgattaacattgcgcaaaacacc 
gcatggaaaatcaacaagaaagtcctagcggtcgccaacgtaatcaccaagtggaagcattgtccggtcga 
ggacatccctgcgattgagcgtgaagaactcccgatgaaaccggaagacatcgacatgaatcctgaggctc 
tcaccgcgtggaaacgtgctgccgctgctgtgtaccgcaaggacaaggctcgcaagtctcgccgtatcagc 
cttgagttcatgcttgagcaagccaataagtttgctaaccataaggccatctggttcccttacaacatgga 
ctggcgcggtcgtgtttacgctgtgtcaatgttcaacccgcaagctaacgatatgaccaaaggactgctta 
cgctggcgaaaggtaaaccaatcggtaaggaaggttactactggctgaaaatccacggtgcaaactgtgcg 
ggtgtcgataaggttccgttccctgagcgcatcaagttcattgaggaaaaccacgagaacatcatggcttg 
cgctaagtctccactggagaacacttggtgggctgagcaagattctccgttctgcttccttgcgttctgct 
ttgagtacgctggggtacagcaccacggcctgagctataactgctcccttccgctggcgtttgacgggtct 
tgctctggcatccagcacttctccgcgatgctccgagatgaggtaggtggtcgcgcggttgtaagcttaaa
ctctatcctactaactaacgaagcttatttaaattttcagaacttgcttcctagtgaaaccgttcaggaca 
tctacgggattgttgctaagaaagtcaacgagattctacaagcagacgcaatcaatgggaccgataacgaa 
gtagttaccgtgaccgatgagaacactggtgaaatctctgagaaagtcaagctgggcactaaggcactggc 
tggtcaatggctggcttacggtgttactcgcagtgtgactaagcgttcagtcatgacgctggcttacgggt
ccaaagagttcggcttccgtcaacaagtgctggaagataccattcagccagctattgattccggcaagggt 
ctgatgttcactcagccgaatcaggctgctggatacatggctaagctgatttgggaatctgtgagcgtgac 
ggtggtagctgcggttgaagcaatgaactggcttaagtctgctgctaagctgctggctgctgaggtcaaag 
ataagaagactggagagattcttcgcaagcgttgcgctgtgcattgggtcactccggatggtttccctgtg 
tggcaggaatacaagaagcctattcaaacgcgtttgaacctgatgttcctcggtcagttccgcttacagcc 
taccattaacaccaacaaagatagcgagattgatgcacacaaacaggagtctggtatcgctcctaactttg 
tacacagccaagacggtagccaccttcgtaagactgtagtgtgggcacacgagaagtacggaatcgaatct 
tttgcactgattcacgactccttcggtaccattccggctgacgctgcgaacctgttcaaagcagtgcgcga 
aactatggttgacacatatgagtcttgtgatgtactggctgatttctacgaccagttcgctgaccagttgc 
acgagtctcaattggacaaaatgccagcacttccggctaaaggtaacttgaacctccgtgacatcttagag 
tcggacttcgcgttcgcgtaagaattccaactgagcgccggtcgctaccattaccaacttgtctggtgtca 
aaaataataggggccgctgtcatcagagtaagtttaaactgagttctactaactaacgagtaatatttaaa 
ttttcagcatctcgcgcccgtgcctctgacttctaagtccaattactcttcaacatccctacatgctcttt 
ctccctgtgctcccaccccctatttttgttattatcaaaaaaacttcttcttaatttctttgttttttagc 
ttcttttaagtcacctctaacaatgaaattgtgtagattcaaaaatagaattaattcgtaataaaaagtcg 
aaaaaaattgtgctccctccccccattaataataattctatcccaaaatctacacaatgttctgtgtacac 
ttcttatgttttttttacttctgataaattttttttgaaacatcatagaaaaaaccgcacacaaaatacct 
tatcatatgttacgtttcagtttatgaccgcaatttttatttcttcgcacgtctgggcctctcatgacgtc 
aaatcatgctcatcgtgaaaaagttttggagtatttttggaatttttcaatcaagtgaaagtttatgaaat 
taattttcctgcttttgctttttgggggtttcccctattgtttgtcaagagtttcgaggacggcgcttttc 
ttgctaaaatcacaagtattgatgagcacgatgcaagaaagatcggaagaaggtttgggtttgaggctcag 
tggaaggtgagtagaagttgataatttgaaagtggagtagtgtctatggggtttttgccttaaatgacaga 
atacattcccaatataccaaacataactgtttcctactagtcggccgtacgggccctttcgtctcgcgcgt 
ttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcgga 
tgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatg 
cggcatcagagcagattgtactgagagtgcaccatatgcggtgtgaaataccgcacagatgcgtaaggaga 
aaataccgcatcaggcggccttaagggcctcgtgatacgcctatttttataggttaatgtcatgataataa
tggtttcttagacgtcaggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaa 
atacattcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaa 
gagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcggcattttgccttcctgtttttg 
ctcacccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggttacatcgaa 
ctggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgttttccaatgatgagcacttt 
taaagttctgctatgtggcgcggtattatcccgtattgacgccgggcaagagcaactcggtcgccgcatac 
actattctcagaatgacttggttgagtactcaccagtcacagaaaagcatcttacggatggcatgacagta 
agagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttctgacaacgatcgg 
aggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttgatcgttgggaac 
cggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaatggcaacaacgttg 
cgcaaactattaactggcgaactacttactctagcttcccggcaacaattaatagactggatggaggcgga 
taaagttgcaggaccacttctgcgctcggcccttccggctggctggtttatcgctgacaaatctggagccg 
gtgagcgtgggtctcgcggtatcattgcagcactggggccagatggtaagccctcccgtatcgtagttatc 
tacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctgagataggtgcctcactgat 
taagcattggtaactgtcagaccaagtttactcatatatactttagattgatttaaaacttcatttttaat 
ttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacgtgagttttcgttc 
cactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgagatcccttttttctgcgcgtaatctg 
ctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactcttt 
ttccgaaggtaactggcttcagcagagcgcagataccaaatactgtccttctagtgtagccgtagttaggc 
caccacttcaagaactctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgctgc 
cagtggcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggataaggcgcagcggtcgg 
gctgaacggggggttcgtgcacacagcccagcttggagcgaacgacctacaccgaaccgagatacctacag 
cgtgagcattgagaaagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggt 
cggaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgtcgggtttc
gccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagc 
aacgcggcctttttacggttcctggccttctgctggccttttgctcacatgttctttcctgcgtcatcccc 
tgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagc 
gcagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggccg 
attcattaatgcagctggcacgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgt 
gagttagctcactcattaggcaccccaggctttacactttatgcttccggctcgtacgttgtgtggaattg 
tgagcggataacaatttcacacaggaaacagctatgaccatgattacgccaagctgtaagtttaaacatga 
tcttactaactaactattctcatttaaattttcagagcttaaaaatggctgaaatcactcacaacgatgga 
tacgctaacaacttggaaatgaaataagcttgcatgcctgcagagcaaaaaaatactgcttttccttgcaa 
aattcggtgctttcttcaaagagaaacttttgaagtcggcgcgagcatttccttcttcgacttctctcttt 
ccgccaaaaagcctagcatttttattgataatttgattacacacactcagagttcttcgacatgataaagt 
gtttcattggcactcgccccaacagtacatgacaagggcggattattatcgatcgatattgaagacaaact 
ccaaatgtgtgctcattttggagccccgtgtggggcagctgctctcaatatattactagggagacgaggag 
ggggaccttatcgaacgtcgcatgagccattctttcttctttatgcactctcttcactctctcacacatta 
atcgattcatagactcccacattccttgatgaaggtgtgggtttttagctttctttcccgatttgtaaaag 
gaagaggctgacgatgttaggaaaaagagaacggagccgaaaaaacatccgtagtaagtcttccttttaag 
ccgacactttttagacagcattcgccgctagttttgaagtttaaattttaaaaaataaaaattagtttcaa 
ttttttttaattactaaataggcaaaagttttttcaagaactctagaaaaactagcttaattcatgggtac 
tagaaaaattcttgttttaaatttaatatttatcttaagatgtaattacgagaagcttttttgaaaattct 
caattaaaagaatttgccgatttagaataaaagtcttcagaaatgagtaaaagctcaaattagaagtttgc 
ttttaaaggaaaaacacgaaaaaagaacactatttatcttttcctccccgcgtaaaactagttgttgtgat 
aatagtgatccgctgtctacttgcactcggctcttcacaccgtgcttcctctcacttgacccaacaggaaa 
aaaaaacatcacgtctgagacggtgaattgccttatcaagagcgtcgtctctttcacccagtaacaaaaaa 
aatttggtttctttactttatatttatgtaggtcacaaaaaaaaagtgatgcagttttgtgggtcggttgt 
ctccacaccacctccgcctccagcagcacacaatcatcttcgtgtgttctcgacgactccttgtatgccgc 
ggtcgtgaatgcaccacattcgacgcgcaactacacaccacactcactttcggtggcattactacacgtca 
tcgttgttcgtagtctcccgctctttcgtccccactcactcctcattatcccccttggcgtattgatttct 
tttaaatggtacaccactcctgacgtctctaccttcttgttttccgtcca^ttagatbctatctggaaatt 
tttttaaaattttaggccagagagttctagttcttgttctaaaagtctaggtcagacatacattttctaCt 
tctcatcaaaaaaaaagttgataaagaaaactggttatfccagaaagagtc-gtctcgccgaaattgattca 
aaaaaaaattcccacccctcgcttgtttctcaaaatatgagatcaacggartttttccttctcgactcaat 
tttttgctgcgctctgtctgccaaagtgtgtgtgtccgagcaaaagatgagagaatttacaaacagaaatg 
aaaaaaagttggccaaataatgaagttttatccgagattgatgggaaagatattaatgttctttacggttt
ggaggggagagagagatagattttcgcatcaaactccgccttttacatgtcttttagaatctaaaatagat 
ttttctcatcatttttaatagaaaatcgagaaattacagtaatttcgcaattttcttgccaaaaatacacg 
aaatttgtgggtctcgccacgatctcggtcttagtggttcatttggtttaaaagtttataaaatttcaaat 
tctagtgtttaatttccgcataattggacctaaaatgggtttttgtcatcattttcaacaagaaatcgtga 
aaatcctgttgtttcgcaattttcttttcaaaaatacacgaaatatatggtaatttcccgaaatattgagg 
gtctcgccacgatttcagtcacagtggccaggatttatcacgaaaaaagttcgcctagtctcacatctccg 
gaaaaccgaatctaaattagttttttgtcatcattttgaacaaaaaatcgagacatccctatagtttcgca 
attttcgtcgcttttctctccaaaaatgacagtctagaattaaaattcgctggaactgggaccatgatatc 
ttttctccccgtttttcattttattttttattacactggattgactaaaggtcaccaccaccgccagtgtg 
tgccatatcacacacacacacacacacaatgtcgagattttatgtgttatccctgcttgatttcgttccgt 
tgtctctctctctctattcatcttttgagccgagaagctccagagaatggagcacacaggatcccggcgcg 
cgatgtcgtcgggagatggcgccgcctgggaagccgccgagagatatcagggaagatcgtctgatttctcc 
tcggatgccacctcatctctcgagtttctccgcctgttactccctgccgaacctgatatttcccgttgtcg 
taaagagatgtttttattttactttacaccgggtcctctctctctgccagcacagctcagtgttgcctgtg 
tgctcgggctcctgccaccggcggcctcatcttcttcttcttcttctcccctgctctcgcttatcacttct 
tcattcattcttattccttttcatcatcaaactagcatttcttactttatttatttttttcaattttcaat 
tttcagataaaaccaaactacttgggttacagccgtcaacagatccccgggattggccaaaggacccaaag 
gtatgtttcgaatgatactaacataacatagaacattttcaggaggacccttgcttggagggtaccgagct 
cagaaaaa 



W.O01/8SU4 



PCT/EP01/05794



T7 promoter Outron 



1 AGCTTGGCGC CTAATACGAC TCACTATAGG GCTGCAGGTC GACTCTAGAT TACAACTAAT TATACTTATT 
TCGAACCGCG GATTATGCTG AGTGATATCC CGACGTCCAG CTGAGATCTA ATGTTGATTA ATATGAATAA 

Outron synth. intron A 

71 TGAATATTCA AATTTTCAGA CCCGGGATTG GCCAAAGGAC CCAAAGGTAT GTTTCGAATG ATACTAACAT 
ACTTATAAGT TTAAAAGTCT GGGCCCTAAC CGGTTTCCTG GGTTTCCATA CAAAGCTTAC TATGATTGTA 

synth. intron A MCS 

141 AACATAGAAC ATTTTCAGGA GGACCCTTGG CTAGCGTCCT GCTGGGATTA CACATGGCAT GGATGAACTA 
TTGTATCTTG TAAAAGTCCT CCTGGGAACC GATCGCAGGA CGACCCTAAT GTGTACCGTA CCTACTTGAT 

unc-54 3' UTR

211 TACAAATAGG GCCGGCCGAG CTCCGCATCG GCCGCTGTCA TCAGATCGCC ATCTCGCGCC CGTGCCTCTG 
ATGTTTATCC CGGCCGGCTC GAGGCGTAGC CGGCGACAGT AGTCTAGCGG TAGAGCGCGG GCACGGAGAC 



unc-54 



281 ACTTCTAAGT CCAATTACTC TTCAACATCC CTACATGCTC TTTCTCCCTG TGCTCCCACC CCCTATTTTT 
TGAAGATTCA GGTTAATGAG AAGTTGTAGG GATGTACGAG AAAGAGGGAC ACGAGGGTGG GGGATAAAAA 

unc-54 3' UTR



351 GTTATTATCA AAAAAACTTC TTCTTAATTT CTTTGTTTTT TAGCTTCTTT TAAGTCACCT CTAACAATGA 
CAATAATAGT TTTTTTGAAG AAGAATTAAA GAAACAAAAA ATCGAAGAAA ATTCAGTGGA GATTGTTACT 

unc-54 3' UTR 



421 AATTGTGTAG ATTCAAAAAT AGAATTAATT CGTAATAAAA AGTCGAAAAA AATTGTGCTC CCTCCCCCCA 
TTAACACATC TAAGTTTTTA TCTTAATTAA GCATTATTTT TCAGCTTTTT TTAACACGAG GGAGGGGGGT 



unc-54 



491 TTAATAATAA TTCTATCCCA AAATCTACAC AATGTTCTGT GTACACTTCT TATGTTTTTT TTACTTCTGA
AATTATTATT AAGATAGGGT TTTAGATGTG TTACAAGACA CATGTGAAGA ATACAAAAAA AATGAAGACT

unc-54 3' UTR



561 TAAATTTTTT TTGAAACATC ATAGAAAAAA CCGCACACAA AATACCTTAT CATATGTTAC GTTTCAGTTT 
ATTTAAAAAA AACTTTGTAG TATCTTTTTT GGCGTGTGTT TTATGGAATA GTATACAATG CAAAGTCAAA 

unc-54 3' UTR

631 ATGACCGCAA TTTTTATTTC TTCGCACGTC TGGGCCTCTC ATGACGTCAA ATCATGCTCA TCGTGAAAAA 
TACTGGCGTT AAAAATAAAG AAGCGTGCAG ACCCGGAGAG TACTGCAGTT TAGTACGAGT AGCACTTTTT 

unc-54 3' UTR 

701 GTTTTGGAGT ATTTTTGGAA TTTTTCAATC AAGTGAAAGT TTATGAAATT AATTTTCCTG CTTTTGCTTT 
CAAAACCTCA TAAAAACCTT AAAAAGTTAG TTCACTTTCA AATACTTTAA TTAAAAGGAC GAAAACGAAA 

unc-54 3' UTR



771 TTGGGGGTTT CCCCTATTGT TTGTCAAGAG TTTCGAGGAC GGCGTTTTTC TTGCTAAAAT CACAAGTATT 
AACCCCCAAA GGGGATAACA AACAGTTCTC AAAGCTCCTG CCGCAAAAAG AACGATTTTA GTGTTCATAA 

unc-54 3' UTR 



841 GATGAGCACG ATGCAAGAAA GATCGGAAGA AGGTTTGGGT TTGAGGCTCA GTGGAAGGTG AGTAGAAGTT 
CTACTCGTGC TACGTTCTTT CTAGCCTTCT TCCAAACCCA AACTCCGAGT CACCTTCCAC TCATCTTCAA 

unc-54 3' UTR

911 GATAATTTGA AAGTGGAGTA GTGTCTATGG GGTTTTTGCC TTAAATGACA GAATACATTC CCAATATACC
CTATTAAACT TTCACCTCAT CACAGATACC CCAAAAACGG AATTTACTGT CTTATGTAAG GGTTATATGG

unc-54 3' UTR

981 AAACATAACT GTTTCCTACT AGTCGGCCGT ACGGGCCCTT TCGTCTCGCG CGTTTCGGTG ATGACGGTGA 
TTTGTATTGA CAAAGGATGA TCAGCCGGCA TGCCCGGGAA AGCAGAGCGC GCAAAGCCAC TACTGCCACT 

1051 AAACCTCTGA CACATGCAGC TCCCGGAGAC GGTCACAGCT TGTCTGTAAG CGGATGCCGG GAGCAGACAA
TTTGGAGACT GTGTACGTCG AGGGCCTCTG CCAGTGTCGA ACAGACATTC GCCTACGGCC CTCGTCTGTT

1121 GCCCGTCAGG GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCTGGCTTAA CTATGCGGCA TCAGAGCAGA
CGGGCAGTCC CGCGCAGTCG CCCACAACCG CCCACAGCCC CGACCGAATT GATACGCCGT AGTCTCGTCT

1191 TTGTACTGAG AGTGCACCAT ATGCGGTGTG AAATACCGCA CAGATGCGTA AGGAGAAAAT ACCGCATCAG
AACATGACTC TCACGTGGTA TACGCCACAC TTTATGGCGT GTCTACGCAT TCCTCTTTTA TGGCGTAGTC

1261 GCGGCCTTAA GGGCCTCGTG ATACGCCTAT TTTTATAGGT TAATGTCATG ATAATAATGG TTTCTTAGAC
CGCCGGAATT CCCGGAGCAC TATGCGGATA AAAATATCCA ATTACAGTAC TATTATTACC AAAGAATCTG

1331 GTCAGGTGGC ACTTTTCGGG GAAATGTGCG CGGAACCCCT ATTTGTTTAT TTTTCTAAAT ACATTCAAAT
CAGTCCACCG TGAAAAGCCC CTTTACACGC GCCTTGGGGA TAAACAAATA AAAAGATTTA TGTAAGTTTA

1401 ATGTATCCGC TCATGAGACA ATAACCCTGA TAAATGCTTC AATAATATTG AAAAAGGAAG AGTATGAGTA
TACATAGGCG AGTACTCTGT TATTGGGACT ATTTACGAAG TTATTATAAC TTTTTCCTTC TCATACTCAT

amp

1471 TTCAACATTT CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG CTCACCCAGA
AAGTTGTAAA GGCACAGCGG GAATAAGGGA AAAAACGCCG TAAAACGGAA GGACAAAAAC GAGTGGGTCT

amp

1541 AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT GCACGAGTGG GTTACATCGA ACTGGATCTC
TTGCGACCAC TTTCATTTTC TACGACTTCT AGTCAACCCA CGTGCTCACC CAATGTAGCT TGACCTAGAG

amp

1611 AACAGCGGTA AGATCCTTGA GAGTTTTCGC CCCGAAGAAC GTTTTCCAAT GATGAGCACT TTTAAAGTTC
TTGTCGCCAT TCTAGGAACT CTCAAAAGCG GGGCTTCTTG CAAAAGGTTA CTACTCGTGA AAATTTCAAG

amp

1681 TGCTATGTGG CGCGGTATTA TCCCGTATTG ACGCCGGGCA AGAGCAACTC GGTCGCCGCA TACACTATTC
ACGATACACC GCGCCATAAT AGGGCATAAC TGCGGCCCGT TCTCGTTGAG CCAGCGGCGT ATGTGATAAG

amp

1751 TCAGAATGAC TTGGTTGAGT ACTCACCAGT CACAGAAAAG CATCTTACGG ATGGCATGAC AGTAAGAGAA
AGTCTTACTG AACCAACTCA TGAGTGGTCA GTGTCTTTTC GTAGAATGCC TACCGTACTG TCATTCTCTT



amp 

1821 TTATGCAGTG CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT TCTGACAACG ATCGGAGGAC 
AATACGTCAC GACGGTATTG GTACTCACTA TTGTGACGCC GGTTGAATGA AGACTGTTGC TAGCCTCCTG 



amp

1891 CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA TGTAACTCGC CTTGATCGTT GGGAACCGGA
GCTTCCTCGA TTGGCGAAAA AACGTGTTGT ACCCCCTAGT ACATTGAGCG GAACTAGCAA CCCTTGGCCT



amp 

1961 GCTGAATGAA GCCATACCAA ACGACGAGCG TGACACCACG ATGCCTGTAG CAATGGCAAC AACGTTGCGC 
CGACTTACTT CGGTATGGTT TGCTGCTCGC ACTGTGGTGC TACGGACATC GTTACCGTTG TTGCAACGCG

2031 AAACTATTAA CTGGCGAACT ACTTACTCTA GCTTCCCGGC AACAATTAAT AGACTGGATG GAGGCGGATA 
TTTGATAATT GACCGCTTGA TGAATGAGAT CGAAGGGCCG TTGTTAATTA TCTGACCTAC CTCCGCCTAT 



amp 



2101 AAGTTGCAGG ACCACTTCTG CGCTCGGCCC TTCCGGCTGG CTGGTTTATT GCTGATAAAT CTGGAGCCGG 
TTCAACGTCC TGGTGAAGAC GCGAGCCGGG AAGGCCGACC GACCAAATAA CGACTATTTA GACCTCGGCC 



2171 TGAGCGTGGG TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT CGTAGTTATC 
ACTCGCACCC AGAGCGCCAT AGTAACGTCG TGACCCCGGT CTACCATTCG GGAGGGCATA GCATCAATAG 

amp 

2241 TACACGACGG GGAGTCAGGC AACTATGGAT GAACGAAATA GACAGATCGC TGAGATAGGT GCCTCACTGA 
ATGTGCTGCC CCTCAGTCCG TTGATACCTA CTTGCTTTAT CTGTCTAGCG ACTCTATCCA CGGAGTGACT 

amp 

2311 TTAAGCATTG GTAACTGTCA GACCAAGTTT ACTCATATAT ACTTTAGATT GATTTAAAAC TTCATTTTTA 
AATTCGTAAC CATTGACAGT CTGGTTCAAA TGAGTATATA TGAAATCTAA CTAAATTTTG AAGTAAAAAT 

2381 ATTTAAAAGG ATCTAGGTGA AGATCCTTTT TGATAATCTC ATGACCAAAA TCCCTTAACG TGAGTTTTCG 
TAAATTTTCC TAGATCCACT TCTAGGAAAA ACTATTAGAG TACTGGTTTT AGGGAATTGC ACTCAAAAGC 

2451 TTCCACTGAG CGTCAGACCC CGTAGAAAAG ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT CTGCGCGTAA
AAGGTGACTC GCAGTCTGGG GCATCTTTTC TAGTTTCCTA GAAGAACTCT AGGAAAAAAA GACGCGCATT 

2521 TCTGCTGCTT GCAAACAAAA AAACCACCGC TACCAGCGGT GGTTTGTTTG CCGGATCAAG AGCTACCAAC 
AGACGACGAA CGTTTGTTTT TTTGGTGGCG ATGGTCGCCA CCAAACAAAC GGCCTAGTTC TCGATGGTTG 

2591 TCTTTTTCCG AAGGTAACTG GCTTCAGCAG AGCGCAGATA CCAAATACTG TCCTTCTAGT GTAGCCGTAG 
AGAAAAAGGC TTCCATTGAC CGAAGTCGTC TCGCGTCTAT GGTTTATGAC AGGAAGATCA CATCGGCATC 

2661 TTAGGCCACC ACTTCAAGAA CTCTGTAGCA CCGCCTACAT ACCTCGCTCT GCTAATCCTG TTACCAGTGG 
AATCCGGTGG TGAAGTTCTT GAGACATCGT GGCGGATGTA TGGAGCGAGA CGATTAGGAC AATGGTCACC 

2731 CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA TAGTTACCGG ATAAGGCGCA
GACGACGGTC ACCGCTATTC AGCACAGAAT GGCCCAACCT GAGTTCTGCT ATCAATGGCC TATTCCGCGT

2801 GCGGTCGGGC TGAACGGGGG GTTCGTGCAC ACAGCCCAGC TTGGAGCGAA CGACCTACAC CGAACTGAGA
CGCCAGCCCG ACTTGCCCCC CAAGCACGTG TGTCGGGTCG AACCTCGCTT GCTGGATGTG GCTTGACTCT

2871 TACCTACAGC GTGAGCATTG AGAAAGCGCC ACGCTTCCCG AAGGGAGAAA GGCGGACAGG TATCCGGTAA 
ATGGATGTCG CACTCGTAAC TCTTTCGCGG TGCGAAGGGC TTCCCTCTTT CCGCCTGTCC ATAGGCCATT 

2941 GCGGCAGGGT CGGAACAGGA GAGCGCACGA GGGAGCTTCC AGGGGGAAAC GCCTGGTATC TTTATAGTCC 
CGCCGTCCCA GCCTTGTCCT CTCGCGTGCT CCCTCGAAGG TCCCCCTTTG CGGACCATAG AAATATCAGG 

3011 TGTCGGGTTT CGCCACCTCT GACTTGAGCG TCGATTTTTG TGATGCTCGT CAGGGGGGCG GAGCCTATGG 
ACAGCCCAAA GCGGTGGAGA CTGAACTCGC AGCTAAAAAC ACTACGAGCA GTCCCCCCGC CTCGGATACC 

3081 AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGCCT TTTGCTGGCC TTTTGCTCAC ATGTTCTTTC 
TTTTTGCGGT CGTTGCGCCG GAAAAATGCC AAGGACCGGA AAACGACCGG AAAACGAGTG TACAAGAAAG 

3151 CTGCGTTATC CCCTGATTCT GTGGATAACC GTATTACCGC CTTTGAGTGA GCTGATACCG CTCGCCGCAG 
GACGCAATAG GGGACTAAGA CACCTATTGG CATAATGGCG GAAACTCACT CGACTATGGC GAGCGGCGTC 

3221 CCGAACGACC GAGCGCAGCG AGTCAGTGAG CGAGGAAGCG GAAGAGCGCC CAATACGCAA ACCGCCTCTC
GGCTTGCTGG CTCGCGTCGC TCAGTCACTC GCTCCTTCGC CTTCTCGCGG GTTATGCGTT TGGCGGAGAG

3291 CCCGCGCGTT GGCCGATTCA TTAATGCAGC TGGCACGACA GGTTTCCCGA CTGGAAAGCG GGCAGTGAGC
GGGCGCGCAA CCGGCTAAGT AATTACGTCG ACCGTGCTGT CCAAAGGGCT GACCTTTCGC CCGTCACTCG

3361 GCAACGCAAT TAATGTGAGT TAGCTCACTC ATTAGGCACC CCAGGCTTTA CACTTTATGC TTCCGGCTCG
CGTTGCGTTA ATTACACTCA ATCGAGTGAG TAATCCGTGG GGTCCGAAAT GTGAAATACG AAGGCCGAGC

3431 TATGTTGTGT GGAATTGTGA GCGGATAACA ATTTCACACA GGAAACAGCT ATGACCATGA TTACGCCA
ATACAACACA CCTTAACACT CGCCTATTGT TAAAGTGTGT CCTTTGTCGA TACTGGTACT AATGCGGT
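
The annotated sequence above opens with a T7 promoter followed by the outron, and the claims characterise the outron as a TTTCAG 3' splice acceptor preceded by an AT-rich region. The Python sketch below shows one way to locate both features in the first 100 nucleotides of the top strand. The sequence is copied from positions 1 to 100 of the annotated figure. The 30-nt window and the 0.7 AT-fraction threshold are illustrative assumptions, since the text does not define "AT-rich" numerically, and the function names are hypothetical.

```python
# Sketch: find the T7 promoter and an outron-like splice acceptor in a DNA string.
# Window size and AT threshold are assumptions chosen for illustration only.

# Positions 1-100 of the top strand, spaced as in the annotated figure.
TOP_STRAND = (
    "AGCTTGGCGC CTAATACGAC TCACTATAGG GCTGCAGGTC GACTCTAGAT TACAACTAAT TATACTTATT "
    "TGAATATTCA AATTTTCAGA CCCGGGATTG"
).replace(" ", "")

T7_PROMOTER = "TAATACGACTCACTATAG"  # the final G is the first transcribed base
ACCEPTOR = "TTTCAG"                 # 3' splice acceptor named in the claims


def at_fraction(seq: str) -> float:
    """Fraction of A/T bases in seq (0.0 for an empty string)."""
    return sum(base in "AT" for base in seq) / len(seq) if seq else 0.0


def find_outron_acceptor(seq: str, window: int = 30, min_at: float = 0.7):
    """Return (promoter_index, acceptor_index, upstream_AT_fraction) or None.

    Indices are 0-based. Only acceptors downstream of the transcription
    start are considered, and the AT fraction is measured over up to
    `window` transcribed bases immediately upstream of the acceptor.
    """
    promoter = seq.find(T7_PROMOTER)
    if promoter < 0:
        return None
    start = promoter + len(T7_PROMOTER) - 1  # index of the first transcribed G
    acceptor = seq.find(ACCEPTOR, start)
    while acceptor >= 0:
        upstream = seq[max(start, acceptor - window):acceptor]
        fraction = at_fraction(upstream)
        if fraction >= min_at:
            return promoter, acceptor, fraction
        acceptor = seq.find(ACCEPTOR, acceptor + 1)
    return None


if __name__ == "__main__":
    print(find_outron_acceptor(TOP_STRAND))
```

With these assumed parameters the acceptor is found a few dozen bases downstream of the promoter, inside the region labelled "Outron" in the figure, and the stretch immediately upstream of it is strongly AT-rich.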

T7 promoter Outron

1 AGCTTGGCGC CTAATACGAC TCACTATAGG GCTGCAGGTC GACTCTAGAT TACAACTAAT TATACTTATT 
TCGAACCGCG GATTATGCTG AGTGATATCC CGACGTCCAG CTGAGATCTA ATGTTGATTA ATATGAATAA 

Outron synth. intron A 

71 TGAATATTCA AATTTTCAGA CCCGGGATTG GCCAAAGGAC CCAAAGGTAT GTTTCGAATG ATACTAACAT 
ACTTATAAGT TTAAAAGTCT GGGCCCTAAC CGGTTTCCTG GGTTTCCATA CAAAGCTTAC TATGATTGTA 

synth. intron A GFP with introns 

141 AACATAGAAC ATTTTCAGGA GGACCCTTGG CTAGCGTCGA CGGTACCATG GGGCGCGCCA TGAGTAAAGG 
TTGTATCTTG TAAAAGTCCT CCTGGGAACC GATCGCAGCT GCCATGGTAC CCCGCGCGGT ACTCATTTCC 

GFP with introns 

211 AGAAGAACTT TTCACTGGAG TTGTCCCAAT TCTTGTTGAA TTAGATGGTG ATGTTAATGG GCACAAATTT 
TCTTCTTGAA AAGTGACCTC AACAGGGTTA AGAACAACTT AATCTACCAC TACAATTACC CGTGTTTAAA 

GFP with introns



281 TCTGTCAGTG GAGAGGGTGA AGGTGATGCA ACATACGGAA AACTTACCCT TAAATTTATT TGCACTACTG 
AGACAGTCAC CTCTCCCACT TCCACTACGT TGTATGCCTT TTGAATGGGA ATTTAAATAA ACGTGATGAC 



GFP with introns 



351 GAAAACTACC TGTTCCATGG GTAAGTTTAA ACATATATAT ACTAACTAAC CCTGATTATT TAAATTTTCA 
CTTTTGATGG ACAAGGTACC CATTCAAATT TGTATATATA TGATTGATTG GGACTAATAA ATTTAAAAGT 



GFP with introns 



421 GCCAACACTT GTCACTACTT TCTGTTATGG TGTTCAATGC TTCTCGAGAT ACCCAGATCA TATGAAACGG 
CGGTTGTGAA CAGTGATGAA AGACAATACC ACAAGTTACG AAGAGCTCTA TGGGTCTAGT ATACTTTGCC

GFP with introns 

491 CATGACTTTT TCAAGAGTGC CATGCCCGAA GGTTATGTAC AGGAAAGAAC TATATTTTTC AAAGATGACG 
GTACTGAAAA AGTTCTCACG GTACGGGCTT CCAATACATG TCCTTTCTTG ATATAAAAAG TTTCTACTGC 



GFP with introns

561 GGAACTACAA GACACGTAAG TTTAAACAGT TCGGTACTAA CTAACCATAC ATATTTAAAT TTTCAGGTGC 
CCTTGATGTT CTGTGCATTC AAATTTGTCA AGCCATGATT GATTGGTATG TATAAATTTA AAAGTCCACG 

GFP with introns 

631 TGAAGTCAAG TTTGAAGGTG ATACCCTTGT TAATAGAATC GAGTTAAAAG GTATTGATTT TAAAGAAGAT
ACTTCAGTTC AAACTTCCAC TATGGGAACA ATTATCTTAG CTCAATTTTC CATAACTAAA ATTTCTTCTA 

GFP with introns 



701 GGAAACATTC TTGGACACAA ATTGGAATAC AACTATAACT CACACAATGT ATACATCATG GCAGACAAAC 
CCTTTGTAAG AACCTGTGTT TAACCTTATG TTGATATTGA GTGTGTTACA TATGTAGTAC CGTCTGTTTG 

GFP with introns 

771 AAAAGAATGG AATCAAAGTT GTAAGTTTAA ACTTGGACTT ACTAACTAAC GGATTATATT TAAATTTTCA 
TTTTCTTACC TTAGTTTCAA CATTCAAATT TGAACCTGAA TGATTGATTG CCTAATATAA ATTTAAAAGT 

GFP with introns 



841 GAACTTCAAA ATTAGACACA ACATTGAAGA TGGAAGCGTT CAACTAGCAG ACCATTATCA ACAAAATACT 
CTTGAAGTTT TAATCTGTGT TGTAACTTCT ACCTTCGCAA GTTGATCGTC TGGTAATAGT TGTTTTATGA 

GFP with introns 

911 CCAATTGGCG ATGGCCCTGT CCTTTTACCA GACAACCATT ACCTGTCCAC ACAATCTGCC CTTTCGAAAG 
GGTTAACCGC TACCGGGACA GGAAAATGGT CTGTTGGTAA TGGACAGGTG TGTTAGACGG GAAAGCTTTC 

GFP with introns 

981 ATCCCAACGA AAAGAGAGAC CACATGGTCC TTCTTGAGTT TGTAACAGCT GCTGGGATTA CACATGGCAT 
TAGGGTTGCT TTTCTCTCTG GTGTACCAGG AAGAACTCAA ACATTGTCGA CGACCCTAAT GTGTACCGTA 

GFP with introns unc-54 3' UTR

1051 GGATGAACTA TACAAATAGG GCCGGCCGAG CTCCGCATCG GCCGCTGTCA TCAGATCGCC ATCTCGCGCC
CCTACTTGAT ATGTTTATCC CGGCCGGCTC GAGGCGTAGC CGGCGACAGT AGTCTAGCGG TAGAGCGCGG

unc-54 3' UTR

1121 CGTGCCTCTG ACTTCTAAGT CCAATTACTC TTCAACATCC CTACATGCTC TTTCTCCCTG TGCTCCCACC 
GCACGGAGAC TGAAGATTCA GGTTAATGAG AAGTTGTAGG GATGTACGAG AAAGAGGGAC ACGAGGGTGG 

unc-54 3' UTR 

1191 CCCTATTTTT GTTATTATCA AAAAAACTTC TTCTTAATTT CTTTGTTTTT TAGCTTCTTT TAAGTCACCT 
GGGATAAAAA CAATAATAGT TTTTTTGAAG AAGAATTAAA GAAACAAAAA ATCGAAGAAA ATTCAGTGGA 

unc-54 3' UTR 



1261 CTAACAATGA AATTGTGTAG ATTCAAAAAT AGAATTAATT CGTAATAAAA AGTCGAAAAA AATTGTGCTC 
GATTGTTACT TTAACACATC TAAGTTTTTA TCTTAATTAA GCATTATTTT TCAGCTTTTT TTAACACGAG 

unc-54 3' UTR



1331 CCTCCCCCCA TTAATAATAA TTCTATCCCA AAATCTACAC AATGTTCTGT GTACACTTCT TATGTTTTTT 
GGAGGGGGGT AATTATTATT AAGATAGGGT TTTAGATGTG TTACAAGACA CATGTGAAGA ATACAAAAAA 

unc-54 3' UTR



1401 TTACTTCTGA TAAATTTTTT TTGAAACATC ATAGAAAAAA CCGCACACAA AATACCTTAT CATATGTTAC 
AATGAAGACT ATTTAAAAAA AACTTTGTAG TATCTTTTTT GGCGTGTGTT TTATGGAATA GTATACAATG 

unc-54 3' UTR



1471 GTTTCAGTTT ATGACCGCAA TTTTTATTTC TTCGCACGTC TGGGCCTCTC ATGACGTCAA ATCATGCTCA 
CAAAGTCAAA TACTGGCGTT AAAAATAAAG AAGCGTGCAG ACCCGGAGAG TACTGCAGTT TAGTACGAGT 

unc-54 3' UTR 

1541 TCGTGAAAAA GTTTTGGAGT ATTTTTGGAA TTTTTCAATC AAGTGAAAGT TTATGAAATT AATTTTCCTG 
AGCACTTTTT CAAAACCTCA TAAAAACCTT AAAAAGTTAG TTCACTTTCA AATACTTTAA TTAAAAGGAC 

unc-54 3' UTR



1611 CTTTTGCTTT TTGGGGGTTT CCCCTATTGT TTGTCAAGAG TTTCGAGGAC GGCGTTTTTC TTGCTAAAAT 
GAAAACGAAA AACCCCCAAA GGGGATAACA AACAGTTCTC AAAGCTCCTG CCGCAAAAAG AACGATTTTA 

unc-54 3' UTR

1681 CACAAGTATT GATGAGCACG ATGCAAGAAA GATCGGAAGA AGGTTTGGGT TTGAGGCTCA GTGGAAGGTG 
GTGTTCATAA CTACTCGTGC TACGTTCTTT CTAGCCTTCT TCCAAACCCA AACTCCGAGT CACCTTCCAC 

unc-54 3' UTR



1751 AGTAGAAGTT GATAATTTGA AAGTGGAGTA GTGTCTATGG GGTTTTTGCC TTAAATGACA GAATACATTC 
TCATCTTCAA CTATTAAACT TTCACCTCAT CACAGATACC CCAAAAACGG AATTTACTGT CTTATGTAAG

unc-54 3' UTR



1821 CCAATATACC AAACATAACT GTTTCCTACT AGTCGGCCGT ACGGGCCCTT TCGTCTCGCG CGTTTCGGTG 
GGTTATATGG TTTGTATTGA CAAAGGATGA TCAGCCGGCA TGCCCGGGAA AGCAGAGCGC GCAAAGCCAC 

1891 ATGACGGTGA AAACCTCTGA CACATGCAGC TCCCGGAGAC GGTCACAGCT TGTCTGTAAG CGGATGCCGG
TACTGCCACT TTTGGAGACT GTGTACGTCG AGGGCCTCTG CCAGTGTCGA ACAGACATTC GCCTACGGCC 

1961 GAGCAGACAA GCCCGTCAGG GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCTGGCTTAA CTATGCGGCA 
CTCGTCTGTT CGGGCAGTCC CGCGCAGTCG CCCACAACCG CCCACAGCCC CGACCGAATT GATACGCCGT 

2031 TCAGAGCAGA TTGTACTGAG AGTGCACCAT ATGCGGTGTG AAATACCGCA CAGATGCGTA AGGAGAAAAT 
AGTCTCGTCT AACATGACTC TCACGTGGTA TACGCCACAC TTTATGGCGT GTCTACGCAT TCCTCTTTTA 

2101 ACCGCATCAG GCGGCCTTAA GGGCCTCGTG ATACGCCTAT TTTTATAGGT TAATGTCATG ATAATAATGG 
TGGCGTAGTC CGCCGGAATT CCCGGAGCAC TATGCGGATA AAAATATCCA ATTACAGTAC TATTATTACC 
2171 TTTCTTAGAC GTCAGGTGGC ACTTTTCGGG GAAATGTGCG CGGAACCCCT ATTTGTTTAT TTTTCTAAAT
AAAGAATCTG CAGTCCACCG TGAAAAGCCC CTTTACACGC GCCTTGGGGA TAAACAAATA AAAAGATTTA 

2241 ACATTCAAAT ATGTATCCGC TCATGAGACA ATAACCCTGA TAAATGCTTC AATAATATTG AAAAAGGAAG 
TGTAAGTTTA TACATAGGCG AGTACTCTGT TATTGGGACT ATTTACGAAG TTATTATAAC TTTTTCCTTC 



amp

2311 AGTATGAGTA TTCAACATTT CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG
TCATACTCAT AAGTTGTAAA GGCACAGCGG GAATAAGGGA AAAAACGCCG TAAAACGGAA GGACAAAAAC

amp

2381 CTCACCCAGA AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT GCACGAGTGG GTTACATCGA
GAGTGGGTCT TTGCGACCAC TTTCATTTTC TACGACTTCT AGTCAACCCA CGTGCTCACC CAATGTAGCT

amp

2451 ACTGGATCTC AACAGCGGTA AGATCCTTGA GAGTTTTCGC CCCGAAGAAC GTTTTCCAAT GATGAGCACT
TGACCTAGAG TTGTCGCCAT TCTAGGAACT CTCAAAAGCG GGGCTTCTTG CAAAAGGTTA CTACTCGTGA

amp

2521 TTTAAAGTTC TGCTATGTGG CGCGGTATTA TCCCGTATTG ACGCCGGGCA AGAGCAACTC GGTCGCCGCA
AAATTTCAAG ACGATACACC GCGCCATAAT AGGGCATAAC TGCGGCCCGT TCTCGTTGAG CCAGCGGCGT

amp

2591 TACACTATTC TCAGAATGAC TTGGTTGAGT ACTCACCAGT CACAGAAAAG CATCTTACGG ATGGCATGAC
ATGTGATAAG AGTCTTACTG AACCAACTCA TGAGTGGTCA GTGTCTTTTC GTAGAATGCC TACCGTACTG



amp 

2661 AGTAAGAGAA TTATGCAGTG CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT TCTGACAACG 
TCATTCTCTT AATACGTCAC GACGGTATTG GTACTCACTA TTGTGACGCC GGTTGAATGA AGACTGTTGC 



amp

2731 ATCGGAGGAC CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA TGTAACTCGC CTTGATCGTT
TAGCCTCCTG GCTTCCTCGA TTGGCGAAAA AACGTGTTGT ACCCCCTAGT ACATTGAGCG GAACTAGCAA

amp

2801 GGGAACCGGA GCTGAATGAA GCCATACCAA ACGACGAGCG TGACACCACG ATGCCTGTAG CAATGGCAAC
CCCTTGGCCT CGACTTACTT CGGTATGGTT TGCTGCTCGC ACTGTGGTGC TACGGACATC GTTACCGTTG

amp

2871 AACGTTGCGC AAACTATTAA CTGGCGAACT ACTTACTCTA GCTTCCCGGC AACAATTAAT AGACTGGATG
TTGCAACGCG TTTGATAATT GACCGCTTGA TGAATGAGAT CGAAGGGCCG TTGTTAATTA TCTGACCTAC

amp

2941 GAGGCGGATA AAGTTGCAGG ACCACTTCTG CGCTCGGCCC TTCCGGCTGG CTGGTTTATT GCTGATAAAT
CTCCGCCTAT TTCAACGTCC TGGTGAAGAC GCGAGCCGGG AAGGCCGACC GACCAAATAA CGACTATTTA



amp 

3011 CTGGAGCCGG TGAGCGTGGG TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT 
GACCTCGGCC ACTCGCACCC AGAGCGCCAT AGTAACGTCG TGACCCCGGT CTACCATTCG GGAGGGCATA 



amp 

3081 CGTAGTTATC TACACGACGG GGAGTCAGGC AACTATGGAT GAACGAAATA GACAGATCGC TGAGATAGGT 
GCATCAATAG ATGTGCTGCC CCTCAGTCCG TTGATACCTA CTTGCTTTAT CTGTCTAGCG ACTCTATCCA 

amp

3151 GCCTCACTGA TTAAGCATTG GTAACTGTCA GACCAAGTTT ACTCATATAT ACTTTAGATT GATTTAAAAC
CGGAGTGACT AATTCGTAAC CATTGACAGT CTGGTTCAAA TGAGTATATA TGAAATCTAA CTAAATTTTG

3221 TTCATTTTTA ATTTAAAAGG ATCTAGGTGA AGATCCTTTT TGATAATCTC ATGACCAAAA TCCCTTAACG
AAGTAAAAAT TAAATTTTCC TAGATCCACT TCTAGGAAAA ACTATTAGAG TACTGGTTTT AGGGAATTGC

3291 TGAGTTTTCG TTCCACTGAG CGTCAGACCC CGTAGAAAAG ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT
ACTCAAAAGC AAGGTGACTC GCAGTCTGGG GCATCTTTTC TAGTTTCCTA GAAGAACTCT AGGAAAAAAA

3361 CTGCGCGTAA TCTGCTGCTT GCAAACAAAA AAACCACCGC TACCAGCGGT GGTTTGTTTG CCGGATCAAG
GACGCGCATT AGACGACGAA CGTTTGTTTT TTTGGTGGCG ATGGTCGCCA CCAAACAAAC GGCCTAGTTC

3431 AGCTACCAAC TCTTTTTCCG AAGGTAACTG GCTTCAGCAG AGCGCAGATA CCAAATACTG TCCTTCTAGT
TCGATGGTTG AGAAAAAGGC TTCCATTGAC CGAAGTCGTC TCGCGTCTAT GGTTTATGAC AGGAAGATCA

3501 GTAGCCGTAG TTAGGCCACC ACTTCAAGAA CTCTGTAGCA CCGCCTACAT ACCTCGCTCT GCTAATCCTG
CATCGGCATC AATCCGGTGG TGAAGTTCTT GAGACATCGT GGCGGATGTA TGGAGCGAGA CGATTAGGAC

3571 TTACCAGTGG CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA TAGTTACCGG
AATGGTCACC GACGACGGTC ACCGCTATTC AGCACAGAAT GGCCCAACCT GAGTTCTGCT ATCAATGGCC

3641 ATAAGGCGCA GCGGTCGGGC TGAACGGGGG GTTCGTGCAC ACAGCCCAGC TTGGAGCGAA CGACCTACAC
TATTCCGCGT CGCCAGCCCG ACTTGCCCCC CAAGCACGTG TGTCGGGTCG AACCTCGCTT GCTGGATGTG

3711 CGAACTGAGA TACCTACAGC GTGAGCATTG AGAAAGCGCC ACGCTTCCCG AAGGGAGAAA GGCGGACAGG
GCTTGACTCT ATGGATGTCG CACTCGTAAC TCTTTCGCGG TGCGAAGGGC TTCCCTCTTT CCGCCTGTCC

3781 TATCCGGTAA GCGGCAGGGT CGGAACAGGA GAGCGCACGA GGGAGCTTCC AGGGGGAAAC GCCTGGTATC
ATAGGCCATT CGCCGTCCCA GCCTTGTCCT CTCGCGTGCT CCCTCGAAGG TCCCCCTTTG CGGACCATAG

3851 TTTATAGTCC TGTCGGGTTT CGCCACCTCT GACTTGAGCG TCGATTTTTG TGATGCTCGT CAGGGGGGCG
AAATATCAGG ACAGCCCAAA GCGGTGGAGA CTGAACTCGC AGCTAAAAAC ACTACGAGCA GTCCCCCCGC

3921 GAGCCTATGG AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGCCT TTTGCTGGCC TTTTGCTCAC
CTCGGATACC TTTTTGCGGT CGTTGCGCCG GAAAAATGCC AAGGACCGGA AAACGACCGG AAAACGAGTG

3991 ATGTTCTTTC CTGCGTTATC CCCTGATTCT GTGGATAACC GTATTACCGC CTTTGAGTGA GCTGATACCG
TACAAGAAAG GACGCAATAG GGGACTAAGA CACCTATTGG CATAATGGCG GAAACTCACT CGACTATGGC

4061 CTCGCCGCAG CCGAACGACC GAGCGCAGCG AGTCAGTGAG CGAGGAAGCG GAAGAGCGCC CAATACGCAA
GAGCGGCGTC GGCTTGCTGG CTCGCGTCGC TCAGTCACTC GCTCCTTCGC CTTCTCGCGG GTTATGCGTT

4131 ACCGCCTCTC CCCGCGCGTT GGCCGATTCA TTAATGCAGC TGGCACGACA GGTTTCCCGA CTGGAAAGCG
TGGCGGAGAG GGGCGCGCAA CCGGCTAAGT AATTACGTCG ACCGTGCTGT CCAAAGGGCT GACCTTTCGC

4201 GGCAGTGAGC GCAACGCAAT TAATGTGAGT TAGCTCACTC ATTAGGCACC CCAGGCTTTA CACTTTATGC
CCGTCACTCG CGTTGCGTTA ATTACACTCA ATCGAGTGAG TAATCCGTGG GGTCCGAAAT GTGAAATACG

4271 TTCCGGCTCG TATGTTGTGT GGAATTGTGA GCGGATAACA ATTTCACACA GGAAACAGCT ATGACCATGA
AAGGCCGAGC ATACAACACA CCTTAACACT CGCCTATTGT TAAAGTGTGT CCTTTGTCGA TACTGGTACT

4341

TTACGCCA 
AATGCGGT 
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SEQUENCE LISTING 

<U0> DEVGEN NV 

<120> GENE EXPRESSION SYSTEM 

<130> SCB/65177/001 

<140> 
<141> 

<160> 5 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 47 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide o-GN59 

<400> 1

ctagattaca actaattata cttatttgaa tattcaaatt ttcagac 47

<210> 2 
<211> 47 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
oligonucleotide O-GN60

<400> 2 

ccgggtctga aaatttgaat attcaaataa gtataattag ttgtaat 47 

<210> 3 
<211> 3498 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plasmid 
pDW3123 

<400> 3 

agcttggcgc ctaatacgac tcactatagg gctgcaggtc gactctagat tacaactaat 60
tatacttatt tgaatattca aattttcaga cccgggattg gccaaaggac ccaaaggtat 120
gtttcgaatg atactaacat aacatagaac attttcagga ggacccttgg ctagcgtcct 180
gctgggatta cacatggcat ggatgaacta tacaaatagg gccggccgag ctccgcatcg 240
gccgctgtca tcagatcgcc atctcgcgcc cgtgcctctg acttctaagt ccaattactc 300
ttcaacatcc ctacatgctc tttctccctg tgctcccacc ccctattttt gttattatca 360
aaaaaacttc ttcttaattt ctttgttttt tagcttcttt taagtcacct ctaacaatga 420
aattgtgtag attcaaaaat agaattaatt cgtaataaaa agtcgaaaaa aattgtgctc 480
cctcccccca ttaataataa ttctatccca aaatctacac aatgttctgt gtacacttct 540
tatgtttttt ttacttctga taaatttttt ttgaaacatc atagaaaaaa ccgcacacaa 600
aataccttat catatgttac gtttcagttt atgaccgcaa tttttatttc ttcgcacgtc 660
tgggcctctc atgacgtcaa atcatgctca tcgtgaaaaa gttttggagt atttttggaa 720
tttttcaatc aagtgaaagt ttatgaaatt aattttcctg cttttgcttt ttgggggttt 780
cccctattgt ttgtcaagag tttcgaggac ggcgtttttc ttgctaaaat cacaagtatt 840
gatgagcacg atgcaagaaa gatcggaaga aggtttgggt ttgaggctca gtggaaggtg 900
agtagaagtt gataatttga aagtggagta gtgtctatgg ggtttttgcc ttaaatgaca 960
gaatacattc ccaatatacc aaacataact gtttcctact agtcggccgt acgggccctt 1020
tcgtctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc tcccggagac 1080
ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg gcgcgtcagc 1140
gggtgttggc gggtgtcggg gctggcttaa ctatgcggca tcagagcaga ttgtactgag 1200
agtgcaccat atgcggtgtg aaataccgca cagatgcgta aggagaaaat accgcatcag 1260
gcggccttaa gggcctcgtg atacgcctat ttttataggt taatgtcatg ataataatgg 1320
tttcttagac gtcaggtggc acttttcggg gaaatgtgcg cggaacccct atttgtttat 1380
ttttctaaat acattcaaat atgtatccgc tcatgagaca ataaccctga taaatgcttc 1440
aataatattg aaaaaggaag agtatgagta ttcaacattt ccgtgtcgcc cttattccct 1500
tttttgcggc attttgcctt cctgtttttg ctcacccaga aacgctggtg aaagtaaaag 1560
atgctgaaga tcagttgggt gcacgagtgg gttacatcga actggatctc aacagcggta 1620
agatccttga gagttttcgc cccgaagaac gttttccaat gatgagcact tttaaagttc 1680
tgctatgtgg cgcggtatta tcccgtattg acgccgggca agagcaactc ggtcgccgca 1740
tacactattc tcagaatgac ttggttgagt actcaccagt cacagaaaag catcttacgg 1800
atggcatgac agtaagagaa ttatgcagtg ctgccataac catgagtgat aacactgcgg 1860
ccaacttact tctgacaacg atcggaggac cgaaggagct aaccgctttt ttgcacaaca 1920
tgggggatca tgtaactcgc cttgatcgtt gggaaccgga gctgaatgaa gccataccaa 1980
acgacgagcg tgacaccacg atgcctgtag caatggcaac aacgttgcgc aaactattaa 2040
ctggcgaact acttactcta gcttcccggc aacaattaat agactggatg gaggcggata 2100
aagttgcagg accacttctg cgctcggccc ttccggctgg ctggtttatt gctgataaat 2160
ctggagccgg tgagcgtggg tctcgcggta tcattgcagc actggggcca gatggtaagc 2220
cctcccgtat cgtagttatc tacacgacgg ggagtcaggc aactatggat gaacgaaata 2280
gacagatcgc tgagataggt gcctcactga ttaagcattg gtaactgtca gaccaagttt 2340
actcatatat actttagatt gatttaaaac ttcattttta atttaaaagg atctaggtga 2400
agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg ttccactgag 2460
cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa 2520
tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag 2580
agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata ccaaatactg 2640
tccttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca ccgcctacat 2700
acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag tcgtgtctta 2760
ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg 2820
gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc 2880
gtgagcattg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg tatccggtaa 2940
gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac gcctggtatc 3000
tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg tgatgctcgt 3060
caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct 3120
tttgctggcc ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc 3180
gtattaccgc ctttgagtga gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg 3240
agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa accgcctctc cccgcgcgtt 3300
ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg ggcagtgagc 3360
gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta cactttatgc 3420
ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca ggaaacagct 3480
atgaccatga ttacgcca 3498

<210> 4
<211> 4348
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: plasmid
pDW3124

<400> 4

agcttggcgc ctaatacgac tcactatagg gctgcaggtc gactctagat tacaactaat 60
tatacttatt tgaatattca aattttcaga cccgggattg gccaaaggac ccaaaggtat 120
gtttcgaatg atactaacat aacatagaac attttcagga ggacccttgg ctagcgtcga 180
cggtaccatg gggcgcgcca tgagtaaagg agaagaactt ttcactggag ttgtcccaat 240
tcttgttgaa ttagatggtg atgttaatgg gcacaaattt tctgtcagtg gagagggtga 300
aggtgatgca acatacggaa aacttaccct taaatttatt tgcactactg gaaaactacc 360
tgttccatgg gtaagtttaa acatatatat actaactaac cctgattatt taaattttca 420
gccaacactt gtcactactt tctgttatgg tgttcaatgc ttctcgagat acccagatca 480
tatgaaacgg catgactttt tcaagagtgc catgcccgaa ggttatgtac aggaaagaac 540
tatatttttc aaagatgacg ggaactacaa gacacgtaag tttaaacagt tcggtactaa 600
ctaaccatac atatttaaat tttcaggtgc tgaagtcaag tttgaaggtg atacccttgt 660
taatagaatc gagttaaaag gtattgattt taaagaagat ggaaacattc ttggacacaa 720
attggaatac aactataact cacacaatgt atacatcatg gcagacaaac aaaagaatgg 780
aatcaaagtt gtaagtttaa acttggactt actaactaac ggattatatt taaattttca 840
gaacttcaaa attagacaca acattgaaga tggaagcgtt caactagcag accattatca 900
acaaaatact ccaattggcg atggccctgt ccttttacca gacaaccatt acctgtccac 960
acaatctgcc ctttcgaaag atcccaacga aaagagagac cacatggtcc ttcttgagtt 1020
tgtaacagct gctgggatta cacatggcat ggatgaacta tacaaatagg gccggccgag 1080
ctccgcatcg gccgctgtca tcagatcgcc atctcgcgcc cgtgcctctg acttctaagt 1140
ccaattactc ttcaacatcc ctacatgctc tttctccctg tgctcccacc ccctattttt 1200
gttattatca aaaaaacttc ttcttaattt ctttgttttt tagcttcttt taagtcacct 1260
ctaacaatga aattgtgtag attcaaaaat agaattaatt cgtaataaaa agtcgaaaaa 1320
aattgtgctc cctcccccca ttaataataa ttctatccca aaatctacac aatgttctgt 1380
gtacacttct tatgtttttt ttacttctga taaatttttt ttgaaacatc atagaaaaaa 1440
ccgcacacaa aataccttat catatgttac gtttcagttt atgaccgcaa tttttatttc 1500
ttcgcacgtc tgggcctctc atgacgtcaa atcatgctca tcgtgaaaaa gttttggagt 1560
atttttggaa tttttcaatc aagtgaaagt ttatgaaatt aattttcctg cttttgcttt 1620
ttgggggttt cccctattgt ttgtcaagag tttcgaggac ggcgtttttc ttgctaaaat 1680
cacaagtatt gatgagcacg atgcaagaaa gatcggaaga aggtttgggt ttgaggctca 1740
gtggaaggtg agtagaagtt gataatttga aagtggagta gtgtctatgg ggtttttgcc 1800
ttaaatgaca gaatacattc ccaatatacc aaacataact gtttcctact agtcggccgt 1860
acgggccctt tcgtctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc 1920
tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg 1980
gcgcgtcagc gggtgttggc gggtgtcggg gctggcttaa ctatgcggca tcagagcaga 2040
ttgtactgag agtgcaccat atgcggtgtg aaataccgca cagatgcgta aggagaaaat 2100
accgcatcag gcggccttaa gggcctcgtg atacgcctat ttttataggt taatgtcatg 2160
ataataatgg tttcttagac gtcaggtggc acttttcggg gaaatgtgcg cggaacccct 2220
atttgtttat ttttctaaat acattcaaat atgtatccgc tcatgagaca ataaccctga 2280
taaatgcttc aataatattg aaaaaggaag agtatgagta ttcaacattt ccgtgtcgcc 2340
cttattccct tttttgcggc attttgcctt cctgtttttg ctcacccaga aacgctggtg 2400
aaagtaaaag atgctgaaga tcagttgggt gcacgagtgg gttacatcga actggatctc 2460
aacagcggta agatccttga gagttttcgc cccgaagaac gttttccaat gatgagcact 2520
tttaaagttc tgctatgtgg cgcggtatta tcccgtattg acgccgggca agagcaactc 2580
ggtcgccgca tacactattc tcagaatgac ttggttgagt actcaccagt cacagaaaag 2640
catcttacgg atggcatgac agtaagagaa ttatgcagtg ctgccataac catgagtgat 2700
aacactgcgg ccaacttact tctgacaacg atcggaggac cgaaggagct aaccgctttt 2760
ttgcacaaca tgggggatca tgtaactcgc cttgatcgtt gggaaccgga gctgaatgaa 2820
gccataccaa acgacgagcg tgacaccacg atgcctgtag caatggcaac aacgttgcgc 2880
aaactattaa ctggcgaact acttactcta gcttcccggc aacaattaat agactggatg 2940
gaggcggata aagttgcagg accacttctg cgctcggccc ttccggctgg ctggtttatt 3000
gctgataaat ctggagccgg tgagcgtggg tctcgcggta tcattgcagc actggggcca 3060
gatggtaagc cctcccgtat cgtagttatc tacacgacgg ggagtcaggc aactatggat 3120
gaacgaaata gacagatcgc tgagataggt gcctcactga ttaagcattg gtaactgtca 3180
gaccaagttt actcatatat actttagatt gatttaaaac ttcattttta atttaaaagg 3240
atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg 3300
ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt 3360
ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg 3420
ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata 3480
ccaaatactg tccttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca 3540
ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag 3600
tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc 3660
tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga 3720
tacctacagc gtgagcattg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg 3780
tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac 3840
gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg 3900
tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg 3960
ttcctggcct tttgctggcc ttttgctcac atgttctttc ctgcgttatc ccctgattct 4020
gtggataacc gtattaccgc ctttgagtga gctgataccg ctcgccgcag ccgaacgacc 4080
gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa accgcctctc 4140
cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 4200
ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 4260
cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 4320
ggaaacagct atgaccatga ttacgcca 4348

<210> 5 
<211> 9309 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plasmid pGN148 
<400> 5 

atgactgctc caaagaagaa gcgtaaggta ccggtaatga acacgattaa catcgctaag 60 
aacgacttct ctgacatcga actggctgct atcccgttca acactctggc tgaccattac 120 
ggtgagcgtt tagctcggta agtttaaaca tctagatact aactaacgat taacatttaa 180 
attttcagcg aacagttggc ccttgagcat gagtcttacg agatgggtga agcacgcttc 240
cgcaagatgt ttgagcgtca acttaaagct ggtgaggttg cggataacgc tgccgccaag 300 
cctctcatca ctaccctact ccctaagatg attgcacgca tcaacgactg gtttgaggaa 360 
gtgaaagcta agcgcggcaa gcgcccgaca gccttccagt tcctgcaaga aatcaagccg 420 
gaagccgtag cgtacatcac cattaagacc actctggctt gcctaaccag tgctgacaat 480 
acaaccgttc aggctgtagc aagcgcaatc ggtcgggcca ttgaggacga ggctcgcttc 540 
ggtcgtatcc gtgaccttga agctaagcac ttcaagaaaa acgttgagga acaactcaac 600 
aagcgcgtag ggcacgtcta caagaaagca tttatgcaag ttgtcgaggc tgacatgctc 660 
tctaagggtc tactcggtgg cgaggcgtgg tcttcgtggc ataaggaaga ctctattcat 720 
gtaggagtac gctgcatcga gatgctcatt gagtcaaccg gagtggttag cttacaccgc 780 
caaaatgctg gcgtagtagg tcaagactct gagactatcg aactcgcacc tgaatacgct 840 
gaggctatcg caacccgtgc aggtgcgctg gctggcatct ctccgatgtt ccaaccttgc 900 
gtagttcctc ctaagccgtg gactggcatt actggtggtg gctattgggc taacggtcgt 960 
cgtcctctgg cgctggtgcg tactcacagt aagaaagcac tgatgcgcta cgaagacgtt 1020 
tacatgcctg aggtgtacaa agcgattaac attgcgcaaa acaccgcatg gaaaatcaac 1080 
aagaaagtcc tagcggtcgc caacgtaatc accaagtgga agcattgtcc ggtcgaggac 1140 
atccctgcga ttgagcgtga agaactcccg atgaaaccgg aagacatcga catgaatcct 1200 
gaggctctca ccgcgtggaa acgtgctgcc gctgctgtgt accgcaagga caaggctcgc 1260 
aagtctcgcc gtatcagcct tgagttcatg cttgagcaag ccaataagtt tgctaaccat 1320 
aaggccatct ggttccctta caacatggac tggcgcggtc gtgtttacgc tgtgtcaatg 1380 
ttcaacccgc aagctaacga tatgaccaaa ggactgctta cgctggcgaa aggtaaacca 1440
atcggtaagg aaggttacta ctggctgaaa atccacggtg caaactgtgc gggtgtcgat 1500 
aaggttccgt tccctgagcg catcaagttc attgaggaaa accacgagaa catcatggct 1560 
tgcgctaagt ctccactgga gaacacttgg tgggctgagc aagattctcc gttctgcttc 1620 
cttgcgttct gctttgagta cgctggggta cagcaccacg gcctgagcta taactgctcc 1680 
cttccgctgg cgtttgacgg gtcttgctct ggcatccagc acttctccgc gatgctccga 1740 
gatgaggtag gtggtcgcgc ggttgtaagt ttaaactcta tcctactaac taacgaagct 1800 
tatttaaatt ttcagaactt gcttcctagt gaaaccgttc aggacatcta cgggattgtt 1860 
gctaagaaag tcaacgagat tctacaagca gacgcaatca atgggaccga taacgaagta 1920 
gttaccgtga ccgatgagaa cactggtgaa atctctgaga aagtcaagct gggcactaag 1980 
gcactggctg gtcaatggct ggcttacggt gttactcgca gtgtgactaa gcgttcagtc 2040 
atgacgctgg cttacgggtc caaagagttc ggcttccgtc aacaagtgct ggaagatacc 2100 
attcagccag ctattgattc cggcaagggt ctgatgttca ctcagccgaa tcaggctgct 2160 
ggatacatgg ctaagctgat ttgggaatct gtgagcgtga cggtggtagc tgcggttgaa 2220 
gcaatgaact ggcttaagtc tgctgctaag ctgctggctg ctgaggtcaa agataagaag 2280 
actggagaga ttcttcgcaa gcgttgcgct gtgcattggg tcactccgga tggtttccct 2340
gtgtggcagg aatacaagaa gcctattcaa acgcgtttga acctgatgtt cctcggtcag 2400 
ttccgcttac agcctaccat taacaccaac aaagatagcg agattgatgc acacaaacag 2460
gagtctggta tcgctcctaa ctttgtacac agccaagacg gtagccacct tcgtaagact 2520 
gtagtgtggg cacacgagaa gtacggaatc gaatcttttg cactgattca cgactccttc 2580
ggtaccattc cggctgacgc tgcgaacctg ttcaaagcag tgcgcgaaac tatggttgac 2640
acatatgagt cttgtgatgt actggctgat ttctacgacc agttcgctga ccagttgcac 2700
gagtctcaat tggacaaaat gccagcactt ccggctaaag gtaacttgaa cctccgtgac 2760
atcttagagt cggacttcgc gttcgcgtaa gaattccaac tgagcgccgg tcgctaccat 2820
taccaacttg tctggtgtca aaaataatag gggccgctgt catcagagta agtttaaact 2880
gagttctact aactaacgag taatatttaa attttcagca tctcgcgccc gtgcctctga 2940
cttctaagtc caattactct tcaacatccc tacatgctct ttctccctgt gctcccaccc 3000
cctatttttg ttattatcaa aaaaacttct tcttaatttc tttgtttttt agcttctttt 3060
aagtcacctc taacaatgaa attgtgtaga ttcaaaaata gaattaattc gtaataaaaa 3120
gtcgaaaaaa attgtgctcc ctccccccat taataataat tctatcccaa aatctacaca 3180
atgttctgtg tacacttctt atgttttttt tacttctgat aaattttttt tgaaacatca 3240
tagaaaaaac cgcacacaaa ataccttatc atatgttacg tttcagttta tgaccgcaat 3300
ttttatttct tcgcacgtct gggcctctca tgacgtcaaa tcatgctcat cgtgaaaaag 3360
ttttggagta tttttggaat ttttcaatca agtgaaagtt tatgaaatta attttcctgc 3420
ttttgctttt tgggggtttc ccctattgtt tgtcaagagt ttcgaggacg gcgtttttct 3480
tgctaaaatc acaagtattg atgagcacga tgcaagaaag atcggaagaa ggtttgggtt 3540
tgaggctcag tggaaggtga gtagaagttg ataatttgaa agtggagtag tgtctatggg 3600
gtttttgcct taaatgacag aatacattcc caatatacca aacataactg tttcctacta 3660
gtcggccgta cgggcccttt cgtctcgcgc gtttcggtga tgacggtgaa aacctctgac 3720
acatgcagct cccggagacg gtcacagctt gtctgtaagc ggatgccggg agcagacaag 3780
cccgtcaggg cgcgtcagcg ggtgttggcg ggtgtcgggg ctggcttaac tatgcggcat 3840
cagagcagat tgtactgaga gtgcaccata tgcggtgtga aataccgcac agatgcgtaa 3900
ggagaaaata ccgcatcagg cggccttaag ggcctcgtga tacgcctatt tttataggtt 3960
aatgtcatga taataatggt ttcttagacg tcaggtggca cttttcgggg aaatgtgcgc 4020
ggaaccccta tttgtttatt tttctaaata cattcaaata tgtatccgct catgagacaa 4080
taaccctgat aaatgcttca ataatattga aaaaggaaga gtatgagtat tcaacatttc 4140
cgtgtcgccc ttattccctt ttttgcggca ttttgccttc ctgtttttgc tcacccagaa 4200
acgctggtga aagtaaaaga tgctgaagat cagttgggtg cacgagtggg ttacatcgaa 4260
ctggatctca acagcggtaa gatccttgag agttttcgcc ccgaagaacg ttttccaatg 4320
atgagcactt ttaaagttct gctatgtggc gcggtattat cccgtattga cgccgggcaa 4380
gagcaactcg gtcgccgcat acactattct cagaatgact tggttgagta ctcaccagtc 4440
acagaaaagc atcttacgga tggcatgaca gtaagagaat tatgcagtgc tgccataacc 4500
atgagtgata acactgcggc caacttactt ctgacaacga tcggaggacc gaaggagcta 4560
accgcttttt tgcacaacat gggggatcat gtaactcgcc ttgatcgttg ggaaccggag 4620
ctgaatgaag ccataccaaa cgacgagcgt gacaccacga tgcctgtagc aatggcaaca 4680
acgttgcgca aactattaac tggcgaacta cttactctag cttcccggca acaattaata 4740
gactggatgg aggcggataa agttgcagga ccacttctgc gctcggccct tccggctggc 4800
tggtttattg ctgataaatc tggagccggt gagcgtgggt ctcgcggtat cattgcagca 4860
ctggggccag atggtaagcc ctcccgtatc gtagttatct acacgacggg gagtcaggca 4920
actatggatg aacgaaatag acagatcgct gagataggtg cctcactgat taagcattgg 4980
taactgtcag accaagttta ctcatatata ctttagattg atttaaaact tcatttttaa 5040
tttaaaagga tctaggtgaa gatccttttt gataatctca tgaccaaaat cccttaacgt 5100
gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 5160
cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 5220
gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 5280
gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 5340
tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 5400
ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 5460
cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 5520
gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag 5580
gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 5640
gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 5700
cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 5760
tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 5820
cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 5880
cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 5940
ccgcctctcc ccgcgcgttg gccgattcat taatgcagct ggcacgacag gtttcccgac 6000
tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt agctcactca ttaggcaccc 6060
caggctttac actttatgct tccggctcgt atgttgtgtg gaattgtgag cggataacaa 6120
tttcacacag gaaacagcta tgaccatgat tacgccaagc tgtaagttta aacatgatct 6180
tactaactaa ctattctcat ttaaattttc agagcttaaa aatggctgaa atcactcaca 6240
acgatggata cgctaacaac ttggaaatga aataagcttg catgcctgca gagcaaaaaa 6300
atactgcttt tccttgcaaa attcggtgct ttcttcaaag agaaactttt gaagtcggcg 6360
cgagcatttc cttctttgac ttctctcttt ccgccaaaaa gcctagcatt tttattgata 6420
atttgattac acacactcag agttcttcga catgataaag tgtttcattg gcactcgccc 6480
taacagtaca tgacaagggc ggattattat cgatcgatat tgaagacaaa ctccaaatgt 6540
gtgctcattt tggagccccg tgtggggcag ctgctctcaa tatattacta gggagacgag 6600
gagggggacc ttatcgaacg tcgcatgagc cattctttct tctttatgca ctctcttcac 6660
tctctcacac attaatcgat tcatagactc ccatattcct tgatgaaggt gtgggttttt 6720
agcttttttt cccgatttgt aaaaggaaga ggctgacgat gttaggaaaa agagaacgga 6780
gccgaaaaaa catccgtagt aagfccttcct tttaagccga cactttttag acagcattcg 6840
ccgctagttt tgaagtttaa attttaaaaa ataaaaatta gtttcaattt tttttaatta 6900
ctaaataggc aaaagttttt tcaagaactc tagaaaaact agcttaattc atgggtacta 6960
gaaaaattct tgttttaaat ttaatattta tcttaagatg taattacgag aagcttttth 7020
gaaaattctc aattaaaaga atttgccgat ttagaataaa agtcttcaga aatgagtaaa 7080
agctcaaatt agaagtttgt ttttaaagga aaaacacgaa aaaagaacac tatttatctt 7140
ttcctccccg cgtaaaatta gttgttgtga taatagtgat ccgctgtcta tttgcactcg 7200
gctcttcaca ccgtgcttcc tctcacttga cccaacagga aaaaaaaaca tcacgtctga 7260
gacggtgaat tgccttatca agagcgtcgt ctctttcacc cagtaacaaa aaaaatttgg 7320
tttctttact ttatatttat gtaggtcaca aaaaaaaagt gatgcagttt tgtgggtcgg 7380
ttgtctccac accacctccg cctccagcag cacacaatca tcttcgtgtg ttctcgacga 7440
ttccttgtat gccgcggtcg tgaatgcacc acattcgacg cgcaactaca caccacactc 7500
actttcggtg gtattactac acgtcatcgt tgttcgtagt ctcccgctct ttcgtcccca 7560
ctcactcctc attattcccc ttggtgtatt gatttttttt aaatggtaca ccactcctga 7620
cgtttctacc ttcttgtttt ccgtccattt agattttatc tggaaatttt tttaaaattt 7680
taggccagag agttctagtt cttgttctaa aagtctaggt cagacataca ttttctattt 7740
ctcatcaaaa aaaaagttga taaagaaaac tggttattca gaaagagtgt gtctcgttga 7800
aattgattca aaaaaaaatt cccacccctc gcttgtttct caaaatatga gatcaacgga 7860
ttttttcctt ctcgattcaa ttttttgctg cgctctgtct gccaaagtgt gtgtgtccga 7920
gcaaaagatg agagaattta caaacagaaa tgaaaaaaag ttggccaaat aatgaagttt 7980
tatccgagat tgatgggaaa gatattaatg ttctttacgg tttggagggg agagagagat 8040
agattttcgc atcaaactcc gccttttaca tgtcttttag aatctaaaat agatttttct 8100
catcattttt aatagaaaat cgagaaatta cagtaatttc gcaattttct tgccaaaaat 8160
acacgaaatt tgtgggtctc gccacgatct cggtcttagt ggttcatttg gtttaaaagt 8220
ttataaaatt tcaaattcta gtgtttaatt tccgcataat tggacctaaa atgggttttt 8280
gtcatcattt tcaacaagaa atcgtgaaaa tcctgttgtt tcgcaatttt cttttcaaaa 8340
atacacgaaa tatatggtaa tttcccgaaa tattgagggt ctcgccacga tttcagtcac 8400
agtggccagg atttatcacg aaaaaagttc gcctagtctc acatttccgg aaaaccgaat 8460
ctaaattagt tttttgtcat cattttgaac aaaaaatcga gacatcccta tagtttcgca 8520
attttcgtcg cttttctctc caaaaatgac agtctagaat taaaattcgc tggaactggg 8580
accatgatat cttttctccc cgtttttcat tttatttttt attacactgg attgactaaa 8640
ggtcaccacc accgccagtg tgtgccatat cacacacaca cacacacaca atgtcgagat 8700
tttatgtgtt atccctgctt gatttcgttc cgttgtctct ctctctctat tcatcttttg 8760
agccgagaag ctccagagaa tggagcacac aggatcccgg cgcgcgatgt cgtcgggaga 8820
tggcgccgcc tgggaagccg ccgagagata tcagggaaga tcgtctgatt tctcctcgga 8880
tgccacctca tctctcgagt ttctccgcct gttactccct gccgaacctg atatttcccg 8940
ttgtcgtaaa gagatgtttt tattttactt tacaccgggt cctctctctc tgccagcaca 9000
gctcagtgtt ggctgtgtgc tcgggctcct gccaccggcg gcctcatctt cttcttcttc 9060
ttctctcctg ctctcgctta tcacttcttc attcattctt attccttttc atcatcaaac 9120
tagcahttct tactttattt atttttttca attttcaatt ttcagataaa accaaactac 9180
ttgggttaca gccgtcaaca gatccccggg attggccaaa ggacccaaag gtatgtttcg 9240
aatgatacta acataacata gaacattttc aggaggaccc ttgcttggag ggtaccgagc 9300
tcagaaaaa 9309